Supplementary methods


SM-1. Perturbation expansion of finite-time transition operator and pairwise alignment 
probability: details 

Here, we apply the technique of time-dependent perturbation expansion (e.g., [29,30]) to our
evolutionary model. We first re-express our rate operator as:

$\hat{Q}^{ID}(t) = \hat{Q}_0^{ID}(t) + \hat{Q}_M^{ID}(t)$. - Eq.(SM-1.1)

(It corresponds to Eq.(R4.1).) Here $\hat{Q}_0^{ID}(t) \equiv \hat{Q}_X^{I}(t) + \hat{Q}_X^{D}(t)$ describes the mutation-free
evolution, and $\hat{Q}_M^{ID}(t) \equiv \hat{Q}_M^{I}(t) + \hat{Q}_M^{D}(t)$ describes the single-mutation transition between
states. From the reduced form of Eq.(R3.6), we get:

$\langle s | \hat{Q}_0^{ID}(t) = -R_X^{ID}(s, t) \langle s |$, - Eq.(SM-1.2)

with $R_X^{ID}(s, t) \equiv R_X^{I}(s, t) + R_X^{D}(s, t)$. - Eq.(SM-1.3)

(Eq.(SM-1.2) and Eq.(SM-1.3) correspond to Eq.(R4.2) and Eq.(R4.3), respectively.) Using
the decomposition, Eq.(SM-1.1), the forward equation, Eq.(R3.19), can be rewritten as:

$\frac{\partial}{\partial t'} \hat{P}^{ID}(t, t') - \hat{P}^{ID}(t, t') \hat{Q}_0^{ID}(t') = \hat{P}^{ID}(t, t') \hat{Q}_M^{ID}(t')$. - Eq.(SM-1.4)

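Now, let $\hat{P}_0^{ID}(t', t'') \equiv T\left\{ \exp\left( \int_{t'}^{t''} d\tau\, \hat{Q}_0^{ID}(\tau) \right) \right\}$, and multiply it from the right of each side of
Eq.(SM-1.4). Then, exploiting the equation, $\frac{\partial}{\partial t'} \hat{P}_0^{ID}(t', t'') = -\hat{Q}_0^{ID}(t') \hat{P}_0^{ID}(t', t'')$, we get:

$\frac{\partial}{\partial t'} \left\{ \hat{P}^{ID}(t, t') \hat{P}_0^{ID}(t', t'') \right\} = \hat{P}^{ID}(t, t') \hat{Q}_M^{ID}(t') \hat{P}_0^{ID}(t', t'')$. - Eq.(SM-1.5)

Integrating both sides over time $t' \in [t, t'']$, using $\hat{P}^{ID}(t, t) = \hat{P}_0^{ID}(t'', t'') = \hat{1}$, and replacing
$t''$ with $t'$, we finally obtain a crucial integral equation:

$\hat{P}^{ID}(t, t') = \hat{P}_0^{ID}(t, t') + \int_{t}^{t'} d\tau\, \hat{P}^{ID}(t, \tau) \hat{Q}_M^{ID}(\tau) \hat{P}_0^{ID}(\tau, t')$. - Eq.(SM-1.6)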

(It corresponds to Eq.(R4.4).) Similarly, starting from the backward equation, Eq.(R3.20), we 
can obtain another crucial integral equation: 

$\hat{P}^{ID}(t, t') = \hat{P}_0^{ID}(t, t') + \int_{t}^{t'} d\tau\, \hat{P}_0^{ID}(t, \tau) \hat{Q}_M^{ID}(\tau) \hat{P}^{ID}(\tau, t')$. - Eq.(SM-1.7)

(It corresponds to Eq.(R4.5).) These equations are equivalent to the defining differential 
equations, Eqs.(R3.19-21), because the former were directly derived from the latter. (And the 
latter can also be derived from the former.) 




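Now, to formally solve Eq.(SM-1.6), we assume that the solution can be expanded
as: $\hat{P}^{ID}(t, t') = \sum_{N=0}^{\infty} \hat{P}_{(N)}^{ID}(t, t')$, where $\hat{P}_{(N)}^{ID}(t, t')$ is the collection of terms containing $N$
indel operators each. Substituting this expansion into Eq.(SM-1.6) and comparing the terms
with the same number of indel operators, we find the equations:

$\hat{P}_{(0)}^{ID}(t, t') = \hat{P}_0^{ID}(t, t')$,

$\hat{P}_{(N)}^{ID}(t, t') = \int_{t}^{t'} d\tau\, \hat{P}_{(N-1)}^{ID}(t, \tau) \hat{Q}_M^{ID}(\tau) \hat{P}_0^{ID}(\tau, t')$. - Eqs.(SM-1.8,9)

Using Eq.(SM-1.8) as an initial condition, Eq.(SM-1.9) can be recursively solved to give:

$\hat{P}_{(N)}^{ID}(t, t') = \int \cdots \int_{t < \tau_1 < \cdots < \tau_N < \tau_{N+1} = t'} d\tau_1 \cdots d\tau_N\, \hat{P}_0^{ID}(t, \tau_1) \prod_{\nu=1}^{N} \left\{ \hat{Q}_M^{ID}(\tau_\nu) \hat{P}_0^{ID}(\tau_\nu, \tau_{\nu+1}) \right\}$ - Eq.(SM-1.10)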

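for $N \ge 1$. Substituting this back into the above expansion, we finally get the formal
perturbation expansion of the finite-time transition operator:

$\hat{P}^{ID}(t, t') = \hat{P}_0^{ID}(t, t') + \sum_{N=1}^{\infty} \int \cdots \int_{t < \tau_1 < \cdots < \tau_N < \tau_{N+1} = t'} d\tau_1 \cdots d\tau_N\, \hat{P}_0^{ID}(t, \tau_1) \prod_{\nu=1}^{N} \left\{ \hat{Q}_M^{ID}(\tau_\nu) \hat{P}_0^{ID}(\tau_\nu, \tau_{\nu+1}) \right\}$

$= \hat{P}_0^{ID}(t, t') + \int_{t}^{t'} d\tau\, \hat{P}_0^{ID}(t, \tau) \hat{Q}_M^{ID}(\tau) \hat{P}_0^{ID}(\tau, t')$

$\quad + \iint_{t < \tau_1 < \tau_2 < t'} d\tau_1 d\tau_2\, \hat{P}_0^{ID}(t, \tau_1) \hat{Q}_M^{ID}(\tau_1) \hat{P}_0^{ID}(\tau_1, \tau_2) \hat{Q}_M^{ID}(\tau_2) \hat{P}_0^{ID}(\tau_2, t')$

$\quad + \iiint_{t < \tau_1 < \tau_2 < \tau_3 < t'} d\tau_1 d\tau_2 d\tau_3\, \hat{P}_0^{ID}(t, \tau_1) \hat{Q}_M^{ID}(\tau_1) \hat{P}_0^{ID}(\tau_1, \tau_2) \hat{Q}_M^{ID}(\tau_2) \hat{P}_0^{ID}(\tau_2, \tau_3) \hat{Q}_M^{ID}(\tau_3) \hat{P}_0^{ID}(\tau_3, t') + \cdots$.

- Eq.(SM-1.11)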

Note that Eq.(SM-1.11) can also be derived from Eq.(SM-1.7). Because of Eq.(SM-1.2), the
equation:

$\langle s | \hat{P}_0^{ID}(t, t') = \exp\left\{ -\int_{t}^{t'} d\tau\, R_X^{ID}(s, \tau) \right\} \langle s |$ - Eq.(SM-1.12)

always holds for every state $s \in S^{II}$ and any time points $(t, t') \in [t_I, t_F]^2$ (with $t < t'$). Thus,
$\hat{P}_0^{ID}(t, t')$ describes the state retention during the time interval, $[t, t']$, with the retention
probability exponentially decreasing at the exit rate ($R_X^{ID}(s, \tau)$). Therefore, the $N$-th term in
the solution, Eq.(SM-1.11), literally describes the evolutionary processes where the sequence
underwent exactly $N$ mutations. In his theorems 1 and 2, Feller [35] mathematically proved
that the conditional probability, Eq.(R3.17), obtained by substituting Eq.(SM-1.11) for
$\hat{P}^{ID}(t, t')$ is the solution of the defining time-differential equations of a continuous-time
Markov model (the probability versions of Eqs.(R3.19-21)). In his paper presenting a widely
used method for stochastic simulations, Gillespie [34] in effect gave a more intuitive
derivation of the solution. Gillespie’s method is crucial for molecular evolutionists, because it
gives the basis of the genuine molecular evolution simulators (e.g., [26,27,28]). Our
derivation of the solution, Eq.(SM-1.11), serves as a bridge between Feller’s mathematically
rigorous proof and Gillespie’s intuitive derivation. Ours also helps understand the situation
underlying Feller’s theorems and gives an intuitively clearer view via the neat operator
representation of the solution. [NOTE: Besides, our derivation via perturbation expansion is
more flexible than theirs, because our method can go beyond the separation of exit rate terms
from transition terms (see, e.g., [31]).]
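The structure of Eq.(SM-1.11) can be checked numerically on a toy model. The sketch below is only an illustration under stated assumptions: it uses a hypothetical three-state, time-homogeneous chain (not the indel model itself), with made-up rates, and the function names `taylor_expm` and `dyson_series` are introduced here. It builds the terms $\hat{P}_{(N)}$ by the recursion of Eq.(SM-1.9) on a time grid (trapezoidal rule) and compares their sum with the matrix exponential of the full rate matrix.

```python
import numpy as np

def taylor_expm(a, terms=40):
    """Matrix exponential by a plain Taylor series (adequate for small norms)."""
    result = np.eye(a.shape[0])
    term = np.eye(a.shape[0])
    for k in range(1, terms):
        term = term @ a / k
        result = result + term
    return result

def dyson_series(q0_diag, q_m, t, n_max=12, n_grid=200):
    """Sum the terms P_(N)(0, t), N = 0..n_max, following Eqs.(SM-1.8,9).

    q0_diag: diagonal of Q_0, i.e. minus the exit rates (assumed constant).
    q_m:     off-diagonal transition-rate matrix Q_M (assumed constant).
    """
    taus = np.linspace(0.0, t, n_grid + 1)
    h = taus[1] - taus[0]
    # P_0 is diagonal: pure state retention, exp(-R_X(s) * elapsed time).
    p0 = np.exp(np.outer(taus, q0_diag))                 # shape (grid, states)
    p_prev = np.stack([np.diag(row) for row in p0])      # P_(0)(0, tau_j)
    total = p_prev[-1].copy()
    for _ in range(n_max):
        a = p_prev @ q_m                                 # P_(N-1)(0, tau_i) Q_M
        p_next = np.zeros_like(p_prev)
        for j in range(1, n_grid + 1):
            # Integrand P_(N-1)(0, tau_i) Q_M P_0(tau_i, tau_j) for i = 0..j.
            vals = a[: j + 1] * p0[j::-1][:, None, :]
            # Trapezoidal rule over tau_i in [0, tau_j].
            p_next[j] = h * (vals.sum(axis=0) - 0.5 * (vals[0] + vals[-1]))
        total += p_next[-1]
        p_prev = p_next
    return total

# Hypothetical rates, chosen only for illustration.
q_m = np.array([[0.0, 0.4, 0.3],
                [0.3, 0.0, 0.2],
                [0.1, 0.5, 0.0]])
q0_diag = -q_m.sum(axis=1)
series = dyson_series(q0_diag, q_m, t=1.0)
exact = taylor_expm((np.diag(q0_diag) + q_m) * 1.0)
print(np.max(np.abs(series - exact)))
```

Rows of the matrices correspond to bra-vectors $\langle s |$, so the operator ordering in the code matches the left-to-right ordering in Eq.(SM-1.11).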

Now, examine the action of Eq.(SM-1.11) (with $(t, t')$ replaced by $(t_I, t_F)$) on
every basic state $s_0 \in S^{II}$. To simplify the argument, we symbolically rewrite the action of
$\hat{Q}_M^{ID}(t) \equiv \hat{Q}_M^{I}(t) + \hat{Q}_M^{D}(t)$ on a bra-vector $\langle s |$ as:

$\langle s | \hat{Q}_M^{ID}(t) = \sum_{\hat{M} \in \mathcal{M}^{ID}[L(s)]} r(\hat{M}; s, t) \langle s | \hat{M}$. - Eq.(SM-1.13)

Here, $\mathcal{M}^{ID}[L]$ denotes the set of insertion operators, $\hat{M}_I(x, l)$, and
deletion operators, $\hat{M}_D(x_B, x_E)$, that can act on the sequence of length $L$, and $r(\hat{M}; s, t)$ denotes the
(generally time- and basic-state-dependent) rate parameter of the indel operator $\hat{M}$. Now,
applying each term of Eq.(SM-1.11) to $\langle s_0 |$, replacing $(t, t')$ by $(t_I, t_F)$, and applying
Eq.(SM-1.12) and Eq.(SM-1.13) alternately, we finally get:


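$\langle s_0 | \hat{P}^{ID}(t_I, t_F) = \exp\left\{ -\int_{t_I}^{t_F} d\tau\, R_X^{ID}(s_0, \tau) \right\} \langle s_0 |$

$\quad + \sum_{N=1}^{\infty} \sum_{[\hat{M}_1, \hat{M}_2, \ldots, \hat{M}_N] \in H^{ID}(N; s_0)} P\left[ ([\hat{M}_1, \hat{M}_2, \ldots, \hat{M}_N], [t_I, t_F]) \mid (s_0, t_I) \right] \langle s_0 | \hat{M}_1 \hat{M}_2 \cdots \hat{M}_N$.

- Eq.(SM-1.14)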

Here, $H^{ID}(N; s_0)$ denotes the space of all possible histories of $N$ indels each that begin
with the sequence state $s_0$. And

$P\left[ ([\hat{M}_1, \hat{M}_2, \ldots, \hat{M}_N], [t_I, t_F]) \mid (s_0, t_I) \right]$

$\quad = \int \cdots \int_{t_I = \tau_0 < \tau_1 < \cdots < \tau_N < \tau_{N+1} = t_F} d\tau_1 \cdots d\tau_N \left( \prod_{\nu=1}^{N} r(\hat{M}_\nu; s_{\nu-1}, \tau_\nu) \right) \exp\left\{ -\sum_{\nu=0}^{N} \int_{\tau_\nu}^{\tau_{\nu+1}} d\tau\, R_X^{ID}(s_\nu, \tau) \right\}$,

with $\langle s_\nu | = \langle s_{\nu-1} | \hat{M}_\nu$ for $\nu = 1, \ldots, N$,

- Eq.(SM-1.15)

(which corresponds to Eq.(R4.7)) is the probability that an indel history $[\hat{M}_1, \hat{M}_2, \ldots, \hat{M}_N]$
occurred during the time interval $[t_I, t_F]$, given an initial sequence state $s_0$ at time $t_I$.
Eq.(SM-1.14) supplemented by Eq.(SM-1.15) gives a considerably concrete expression of
the solution of the defining equations, Eqs.(R3.19-21), of our genuine stochastic evolutionary
model. (See subsection 3.1 of [32] for a more detailed explanation of Eqs.(SM-1.14,15).)
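As a small worked instance of Eq.(SM-1.15), consider a single-event history ($N = 1$) under the simplifying assumption of time-independent rates. The sketch below is illustrative only: the function names and the numerical values are hypothetical. It evaluates the one-dimensional time integral by the midpoint rule and compares it with the closed form $r\,(e^{-R_1 T} - e^{-R_0 T})/(R_0 - R_1)$, where $R_0$ and $R_1$ are the exit rates before and after the event and $T = t_F - t_I$.

```python
import math

def one_event_history_probability(rate, exit_before, exit_after, duration,
                                  n_grid=10000):
    """Midpoint-rule evaluation of Eq.(SM-1.15) for N = 1, constant rates."""
    h = duration / n_grid
    total = 0.0
    for i in range(n_grid):
        tau = (i + 0.5) * h  # time of the single indel event
        # Retention in s_0 up to tau, the event, then retention in s_1.
        total += rate * math.exp(-exit_before * tau
                                 - exit_after * (duration - tau))
    return total * h

def one_event_closed_form(rate, exit_before, exit_after, duration):
    """Analytic value of the same integral (constant rates assumed)."""
    if math.isclose(exit_before, exit_after):
        return rate * duration * math.exp(-exit_before * duration)
    return rate * (math.exp(-exit_after * duration)
                   - math.exp(-exit_before * duration)) / (exit_before - exit_after)

# Hypothetical values: event rate 0.3, exit rates 0.7 (before) and 0.5 (after).
numeric = one_event_history_probability(0.3, 0.7, 0.5, 2.0)
analytic = one_event_closed_form(0.3, 0.7, 0.5, 2.0)
print(numeric, analytic)
```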
Now, let $H^{ID}(N = 0; s_0) = \{[\,]\}$ be the set consisting only of the history with zero indel,
$[\,]$, starting with the state $s_0$. We can interpret $\exp\left\{ -\int_{t_I}^{t_F} d\tau\, R_X^{ID}(s_0, \tau) \right\}$ as the conditional
probability of this zero-indel history, $P[([\,], [t_I, t_F]) \mid (s_0, t_I)]$. Thus, Eq.(SM-1.14) can be
rewritten more neatly as:

$\langle s_0 | \hat{P}^{ID}(t_I, t_F) = \sum_{N=0}^{\infty} \sum_{[\hat{M}_1, \hat{M}_2, \ldots, \hat{M}_N] \in H^{ID}(N; s_0)} P\left[ ([\hat{M}_1, \hat{M}_2, \ldots, \hat{M}_N], [t_I, t_F]) \mid (s_0, t_I) \right] \langle s_0 | \hat{M}_1 \hat{M}_2 \cdots \hat{M}_N$.

- Eq.(SM-1.14’)

(It corresponds to Eq.(R4.6).)


Now, substitute an “ancestral” sequence state, $s^A$ ($\in S^{II}$), for $s_0$ in Eq.(SM-1.14’),
and take the inner product between it and the ket-vector, $| s^D \rangle$, of a “descendant” sequence
state, $s^D$ ($\in S^{II}$). This procedure gives the finite-time transition probability,
$\langle s^A | \hat{P}^{ID}(t_I, t_F) | s^D \rangle = P[(s^D, t_F) \mid (s^A, t_I)]$, as the summation of probabilities over all possible
indel histories consistent with the ancestral and descendant sequence states. As exemplified
by Eq.(R2.1), the comparison of $s^D$ with $s^A$ uniquely determines the pairwise sequence
alignment (PWA) between them, with a definite homology structure [48]. Let $\alpha(s^A, s^D)$
denote (the homology structure of) such a PWA. Then, summing the above transition
probability, $\langle s^A | \hat{P}^{ID}(t_I, t_F) | s^D \rangle$, over all “equivalent” $s^D$’s providing $\alpha(s^A, s^D)$ gives
$P[(\alpha(s^A, s^D), [t_I, t_F]) \mid (s^A, t_I)]$, which is the probability that $\alpha(s^A, s^D)$ resulted from the
evolution during the interval $[t_I, t_F]$, given $s^A$ at $t_I$. By analogy to the derivation of
Eq.(SM-1.14’), we obtain the formal expression of this probability as:

$P\left[ (\alpha(s^A, s^D), [t_I, t_F]) \mid (s^A, t_I) \right] = \sum_{N = N_{min}[\alpha(s^A, s^D)]}^{\infty} \sum_{[\hat{M}_1, \hat{M}_2, \ldots, \hat{M}_N] \in H^{ID}[N; \alpha(s^A, s^D)]} P\left[ ([\hat{M}_1, \hat{M}_2, \ldots, \hat{M}_N], [t_I, t_F]) \mid (s^A, t_I) \right]$.

- Eq.(SM-1.16)

(It corresponds to Eq.(R4.8).) Here, $H^{ID}[N; \alpha(s^A, s^D)]$ denotes the set of all histories with
$N$ indels each that can result in $\alpha(s^A, s^D)$, and $N_{min}[\alpha(s^A, s^D)]$ is the minimum number of
indels required for creating the PWA. Now, introduce the symbol that represents the set of all
global indel histories consistent with $\alpha(s^A, s^D)$:

$H^{ID}[\alpha(s^A, s^D)] \equiv \bigcup_{N = N_{min}[\alpha(s^A, s^D)]}^{\infty} H^{ID}[N; \alpha(s^A, s^D)]$. - Eq.(SM-1.17)

Then, Eq.(SM-1.16) can be further simplified as:

$P\left[ (\alpha(s^A, s^D), [t_I, t_F]) \mid (s^A, t_I) \right] = \sum_{[\hat{M}_1, \hat{M}_2, \ldots, \hat{M}_N] \in H^{ID}[\alpha(s^A, s^D)]} P\left[ ([\hat{M}_1, \hat{M}_2, \ldots, \hat{M}_N], [t_I, t_F]) \mid (s^A, t_I) \right]$.

- Eq.(SM-1.16’)

(It corresponds to Eq.(R4.9).) Eq.(SM-1.16) and Eq.(SM-1.16’) are the formal expressions of
the occurrence probability of PWA $\alpha(s^A, s^D)$ derived purely from the defining equations,
Eqs.(R3.19-21), of our evolutionary model. Thus, they are the “ab initio probability” of the
PWA. In section SM-2, we will examine its factorability.


SM-2. Factorability of pairwise alignment probability: details 

Here we examine the factorability of the ab initio probability of PWA $\alpha(s^A, s^D)$,
$P[(\alpha(s^A, s^D), [t_I, t_F]) \mid (s^A, t_I)]$ in Eq.(R4.9), given the ancestral state ($s^A$) at the initial time
($t_I$). As mentioned in section R6 of Results and discussion, each component probability,
$P[([\hat{M}_1, \hat{M}_2, \ldots, \hat{M}_N], [t_I, t_F]) \mid (s^A, t_I)]$ given by Eq.(R4.7), will not be factorable. This is
because its domain of multiple-time integration is not a direct product. Thus, we will need to
combine the probabilities of a number of indel histories. How can we do this? As mentioned
in Section R5, each indel history, $[\hat{M}_1, \hat{M}_2, \ldots, \hat{M}_N]$, belongs to a LHS equivalence class
represented, e.g., by a LHS, $\left\{ [\hat{M}[k, 1], \ldots, \hat{M}[k, N_k]] \right\}_{k = 1, \ldots, K}$, which will be abbreviated as $\vec{\hat{M}}$
hereafter. Let $[\vec{\hat{M}}]_{LHS}$ denote this LHS equivalence class. If $[\hat{M}_1, \hat{M}_2, \ldots, \hat{M}_N]$ can yield
$\alpha(s^A, s^D)$, so can every element of the LHS equivalence class that $[\hat{M}_1, \hat{M}_2, \ldots, \hat{M}_N]$ belongs to. Thus,
obviously, we have $[\vec{\hat{M}}]_{LHS} \subset H^{ID}[\alpha(s^A, s^D)]$ for every $[\vec{\hat{M}}]_{LHS}$ containing an indel history
that can yield $\alpha(s^A, s^D)$. Next, if two indel histories connect with each other through a
series of binary equivalence relations, Eqs.(R5.2a-d), the two histories belong to the same
LHS equivalence class. These facts mean that the set $H^{ID}[\alpha(s^A, s^D)]$ of all histories
consistent with $\alpha(s^A, s^D)$ can be decomposed into a direct sum:

$H^{ID}[\alpha(s^A, s^D)] = \bigcup_{\vec{\hat{M}} \in \Lambda^{ID}[\alpha(s^A, s^D)]} [\vec{\hat{M}}]_{LHS}$. - Eq.(SM-2.1)


(It corresponds to Eq.(R6.5).) Here, $\Lambda^{ID}[\alpha(s^A, s^D)]$ is the set of all LHSs consistent with
$\alpha(s^A, s^D)$. This enables us to further rewrite the PWA probability, Eq.(R4.9), as:

$P\left[ (\alpha(s^A, s^D), [t_I, t_F]) \mid (s^A, t_I) \right] = \sum_{\vec{\hat{M}} \in \Lambda^{ID}[\alpha(s^A, s^D)]} P\left[ ([\vec{\hat{M}}]_{LHS}, [t_I, t_F]) \mid (s^A, t_I) \right]$. - Eq.(SM-2.2)

(It corresponds to Eq.(R6.6).) Here,

$P\left[ ([\vec{\hat{M}}]_{LHS}, [t_I, t_F]) \mid (s^A, t_I) \right] \equiv \sum_{[\hat{M}_1, \hat{M}_2, \ldots, \hat{M}_N] \in [\vec{\hat{M}}]_{LHS}} P\left[ ([\hat{M}_1, \hat{M}_2, \ldots, \hat{M}_N], [t_I, t_F]) \mid (s^A, t_I) \right]$ - Eq.(SM-2.3)

(which corresponds to Eq.(R6.1)) is the “total probability” of $[\vec{\hat{M}}]_{LHS}$. Therefore, if
Eq.(SM-2.3) can be factorized for every LHS $\vec{\hat{M}} \in \Lambda^{ID}[\alpha(s^A, s^D)]$, the PWA probability,
Eq.(SM-2.2), may also become factorable.

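To examine the factorability of Eq.(SM-2.3), it is convenient to consider the
quotients:

$\mu_P\left[ ([\hat{M}_1, \hat{M}_2, \ldots, \hat{M}_N], [t_I, t_F]) \mid (s^A, t_I) \right] \equiv P\left[ ([\hat{M}_1, \hat{M}_2, \ldots, \hat{M}_N], [t_I, t_F]) \mid (s^A, t_I) \right] / P\left[ ([\,], [t_I, t_F]) \mid (s^A, t_I) \right]$, - Eq.(SM-2.4)

$\mu_P\left[ ([\hat{M}[k, 1], \ldots, \hat{M}[k, N_k]], [t_I, t_F]) \mid (s^A, t_I) \right] \equiv P\left[ ([\hat{M}[k, 1], \ldots, \hat{M}[k, N_k]], [t_I, t_F]) \mid (s^A, t_I) \right] / P\left[ ([\,], [t_I, t_F]) \mid (s^A, t_I) \right]$, - Eq.(SM-2.5)

and

$\mu_P\left[ ([\vec{\hat{M}}]_{LHS}, [t_I, t_F]) \mid (s^A, t_I) \right] \equiv P\left[ ([\vec{\hat{M}}]_{LHS}, [t_I, t_F]) \mid (s^A, t_I) \right] / P\left[ ([\,], [t_I, t_F]) \mid (s^A, t_I) \right]$, - Eq.(SM-2.6)

and focus on the relationships between Eqs.(SM-2.4-6). (Eq.(SM-2.5) and Eq.(SM-2.6)
correspond to Eq.(R6.3) and Eq.(R6.4), respectively.) This is because Eq.(SM-2.4), for
example, can be expressed as: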


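$\mu_P\left[ ([\hat{M}_1, \hat{M}_2, \ldots, \hat{M}_N], [t_I, t_F]) \mid (s^A, t_I) \right]$

$\quad = \int \cdots \int_{t_I = \tau_0 < \tau_1 < \cdots < \tau_N < \tau_{N+1} = t_F} d\tau_1 \cdots d\tau_N \left( \prod_{\nu=1}^{N} r(\hat{M}_\nu; s_{\nu-1}, \tau_\nu) \right) \exp\left\{ -\sum_{\nu=0}^{N} \int_{\tau_\nu}^{\tau_{\nu+1}} d\tau\, \delta R_X^{ID}(s_\nu, s^A, \tau) \right\}$,

with $s_0 = s^A$ and $\langle s_\nu | = \langle s_{\nu-1} | \hat{M}_\nu$ for $\nu = 1, \ldots, N$, - Eq.(SM-2.7)

where $\delta R_X^{ID}(s, s', \tau) \equiv R_X^{ID}(s, \tau) - R_X^{ID}(s', \tau)$ is an increment of the exit rate. A similar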
expression applies also to Eq.(SM-2.5). Compared with Eq.(R4.7) (or Eq.(SM-1.15)), the 
merit of Eq.(SM-2.7) is that it enables us to focus on the regions of the sequence where the 
indels took place, if the evolutionary model has desirable properties (revealed below). Thus, 


for a LHS, $\vec{\hat{M}} = \left\{ [\hat{M}[k, 1], \ldots, \hat{M}[k, N_k]] \right\}_{k = 1, \ldots, K}$, we will set the following ansatz:

$\mu_P\left[ ([\vec{\hat{M}}]_{LHS}, [t_I, t_F]) \mid (s^A, t_I) \right] = \prod_{k=1}^{K} \mu_P\left[ ([\hat{M}[k, 1], \ldots, \hat{M}[k, N_k]], [t_I, t_F]) \mid (s^A, t_I) \right]$ - Eq.(SM-2.8)

(which corresponds to Eq.(R6.2)) and seek to find a set of conditions under which it indeed
holds. To get a hint on the conditions, we will look at both sides of Eq.(SM-2.8) more
closely. Using Eq.(SM-2.3) and Eq.(SM-2.7), the left hand side of Eq.(SM-2.8) can be
rewritten as:



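$\mu_P\left[ ([\vec{\hat{M}}]_{LHS}, [t_I, t_F]) \mid (s^A, t_I) \right] = \sum_{[\hat{M}_1, \hat{M}_2, \ldots, \hat{M}_N] \in [\vec{\hat{M}}]_{LHS}} \mu_P\left[ ([\hat{M}_1, \hat{M}_2, \ldots, \hat{M}_N], [t_I, t_F]) \mid (s^A, t_I) \right]$

$\quad = \sum_{[\hat{M}_1, \hat{M}_2, \ldots, \hat{M}_N] \in [\vec{\hat{M}}]_{LHS}} \int \cdots \int_{t_I = \tau_0 < \tau_1 < \cdots < \tau_N < \tau_{N+1} = t_F} d\tau_1 \cdots d\tau_N \left( \prod_{\nu=1}^{N} r(\hat{M}_\nu; s_{\nu-1}, \tau_\nu) \right) \exp\left\{ -\sum_{\nu=0}^{N} \int_{\tau_\nu}^{\tau_{\nu+1}} d\tau\, \delta R_X^{ID}(s_\nu, s^A, \tau) \right\}$,

with $s_0 = s^A$ and $\langle s_\nu | = \langle s_{\nu-1} | \hat{M}_\nu$ for $\nu = 1, \ldots, N$.

- Eq.(SM-2.9)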

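Meanwhile, the right hand side of Eq.(SM-2.8) can be rewritten as:

$\prod_{k=1}^{K} \mu_P\left[ ([\hat{M}[k, 1], \ldots, \hat{M}[k, N_k]], [t_I, t_F]) \mid (s^A, t_I) \right]$

$\quad = \prod_{k=1}^{K} \int \cdots \int_{t_I = \tau(k, 0) < \tau(k, 1) < \cdots < \tau(k, N_k) < \tau(k, N_k + 1) = t_F} d\tau(k, 1) \cdots d\tau(k, N_k) \left( \prod_{i_k = 1}^{N_k} r(\hat{M}[k, i_k]; s(k, i_k - 1), \tau(k, i_k)) \right)$

$\qquad \times \exp\left\{ -\sum_{i_k = 0}^{N_k} \int_{\tau(k, i_k)}^{\tau(k, i_k + 1)} d\tau\, \delta R_X^{ID}(s(k, i_k), s^A, \tau) \right\}$,

with $s(k, 0) = s^A$ and $\langle s(k, i_k) | = \langle s(k, i_k - 1) | \hat{M}[k, i_k]$ for $i_k = 1, \ldots, N_k$.

- Eq.(SM-2.10)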

As we can see, Eq.(SM-2.9) and Eq.(SM-2.10) are quite similar. Each term in either
expression is integration over $N$ ($= \sum_{k=1}^{K} N_k$) time variables. And each history,
$[\hat{M}_1, \hat{M}_2, \ldots, \hat{M}_N]$, in Eq.(SM-2.9) is nothing other than a rearrangement of the equivalents of
the events in the LHS, $\vec{\hat{M}} = \left\{ [\hat{M}[k, 1], \ldots, \hat{M}[k, N_k]] \right\}_{k = 1, \ldots, K}$. Therefore, if the following two
equations hold, the ansatz, Eq.(SM-2.8), will also hold.
(A) The equation between the domains of integration:

$\sum_{[\hat{M}_1, \hat{M}_2, \ldots, \hat{M}_N] \in [\vec{\hat{M}}]_{LHS}} \int \cdots \int_{t_I = \tau_0 < \tau_1 < \cdots < \tau_N < \tau_{N+1} = t_F} d\tau_1 \cdots d\tau_N\, (\ldots)$

$\quad = \prod_{k=1}^{K} \int \cdots \int_{t_I = \tau(k, 0) < \tau(k, 1) < \cdots < \tau(k, N_k) < \tau(k, N_k + 1) = t_F} d\tau(k, 1) \cdots d\tau(k, N_k)\, (\ldots)$

(B) The equation between the integrands (i.e., the probability densities): 



(NOTE: Here, the equations were deliberately given in a rough manner, to aid the reader’s 
intuitive understanding. Supplementary appendix SA-2.1 in Additional file 2 gives their 
mathematically rigorous forms.) Considering that a LHS equivalence class contains all 
possible local-order-conserving rearrangements of events in the representative LHS, equation 
(A) is intuitively very plausible. However, its mathematically rigorous proof is not so 
straightforward, and is given in Supplementary appendix SA-2.2 in Additional file 2.
Equation (B) might be intuitively less plausible, because of the differences in $\delta R_X^{ID}(s, s', \tau)$
on both sides. Nevertheless, we can prove that it also holds, provided that the following set of
conditions is satisfied.


Condition (i): The rate of an indel event ($r(\hat{M}_\nu; s_{\nu-1}, \tau_\nu)$) is independent of the portion of the
sequence state ($s_{\nu-1}$) outside of the region of the local history the event ($\hat{M}_\nu$) belongs to.
Condition (ii): The increment of the exit rate due to an indel event ($\delta R_X^{ID}(s_\nu, s_{\nu-1}, \tau)$, with
$\langle s_\nu | = \langle s_{\nu-1} | \hat{M}_\nu$) is independent of the portion of the sequence state ($s_{\nu-1}$) outside of the region
of the local history the event ($\hat{M}_\nu$) belongs to.


See Supplementary appendix SA-2.1 and SA-2.3 in Additional file 2 for the derivation of the 
mathematically rigorous version of this set of conditions. (For illustration, in Supplementary 
methods SM-3, the factorability of the probability will be examined for the simplest concrete 
LHS equivalence class (given in Figure 5).) 

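Once the factorability, Eq.(SM-2.8) (or Eq.(R6.2)), is established for each LHS
equivalence class, it is relatively easy to prove the factorability for the total quotient for the
PWA:

$\mu_P\left[ (\alpha(s^A, s^D), [t_I, t_F]) \mid (s^A, t_I) \right] \equiv P\left[ (\alpha(s^A, s^D), [t_I, t_F]) \mid (s^A, t_I) \right] / P\left[ ([\,], [t_I, t_F]) \mid (s^A, t_I) \right]$

$\quad = \sum_{\vec{\hat{M}} \in \Lambda^{ID}[\alpha(s^A, s^D)]} \mu_P\left[ ([\vec{\hat{M}}]_{LHS}, [t_I, t_F]) \mid (s^A, t_I) \right]$

- Eq.(SM-2.11)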

(which is equivalent to Eq.(R6.6) (or Eq.(SM-2.2))). Thanks to Eq.(SM-2.8) (or Eq.(R6.2)),
each summand on the rightmost side is already factorized. One caveat, however, is that the set
of local-history-accommodating regions could vary depending on the LHS, even if the
resulting PWA is the same. This is because we are considering all indel histories, including
non-parsimonious ones, that can yield the PWA, $\alpha(s^A, s^D)$. [NOTE: Some non-parsimonious
indel histories contain local histories in between contiguous PASs, such as
$[\hat{M}_I(x, l), \hat{M}_D(x + 1, x + l)]$, which leave no traces of their own occurrences. They vary the set
of regions accommodating local histories.] We will choose the maximum possible set of
PASs in the given PWA, which separates the PWA into the finest potentially
local-history-accommodating regions. [NOTE: Such a maximum set does not necessarily
consist of all PASs in the PWA. An example is given in subsection R8-3.] Let $\gamma_1, \gamma_2, \ldots, \gamma_{\kappa_{max}}$
be such regions, where the number of regions, $\kappa_{max}$, is uniquely determined by the PWA and
the evolutionary model. Then, we can represent any $\vec{\hat{M}}$ as a vector with $\kappa_{max}$ components:
$\vec{\hat{M}} = \left( \vec{\hat{M}}[\gamma_1], \vec{\hat{M}}[\gamma_2], \ldots, \vec{\hat{M}}[\gamma_{\kappa_{max}}] \right)$. Here $\vec{\hat{M}}[\gamma_\kappa] = [\hat{M}[k, 1], \ldots, \hat{M}[k, N_k]]$ if the $k$th local
history is confined in region $\gamma_\kappa$, or $\vec{\hat{M}}[\gamma_\kappa] = [\,]$ (empty) if no events in the LHS occurred in
$\gamma_\kappa$ (Figure S1). Then, keeping $\mu_P[([\,], [t_I, t_F]) \mid (s^A, t_I)] = 1$ in mind, the factorability,
Eq.(R6.8), can be re-expressed as:


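$\mu_P\left[ ([\vec{\hat{M}}]_{LHS}, [t_I, t_F]) \mid (s^A, t_I) \right] = \prod_{\kappa = 1}^{\kappa_{max}} \mu_P\left[ (\vec{\hat{M}}[\gamma_\kappa], [t_I, t_F]) \mid (s^A, t_I) \right]$. - Eq.(SM-2.12)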


Now, consider the space $\Lambda^{ID}[\alpha(s^A, s^D)]$ itself. Any two different LHSs in this space differ
at least by a local history in some $\gamma_\kappa$. Conversely, any given vector,
$\left( \vec{\hat{M}}[\gamma_1], \vec{\hat{M}}[\gamma_2], \ldots, \vec{\hat{M}}[\gamma_{\kappa_{max}}] \right)$, each of whose components ($\vec{\hat{M}}[\gamma_\kappa]$) is consistent with the PWA
restricted in the region ($\gamma_\kappa$), defines a LHS in $\Lambda^{ID}[\alpha(s^A, s^D)]$. Thus, the set $\Lambda^{ID}[\alpha(s^A, s^D)]$
should be represented as a “direct product”: $\Lambda^{ID}[\alpha(s^A, s^D)] = \times_{\kappa = 1}^{\kappa_{max}} \Lambda^{ID}[\gamma_\kappa; \alpha(s^A, s^D)]$, where
$\Lambda^{ID}[\gamma_\kappa; \alpha(s^A, s^D)]$ denotes the set of local indel histories in $\gamma_\kappa$ that can give rise to the
sub-PWA of $\alpha(s^A, s^D)$ confined in $\gamma_\kappa$. Using this structure of $\Lambda^{ID}[\alpha(s^A, s^D)]$ and
substituting Eq.(SM-2.12) for each $\vec{\hat{M}} \in \Lambda^{ID}[\alpha(s^A, s^D)]$ into Eq.(SM-2.11), we finally get
the desired factorization of the PWA probability quotient:

$\mu_P\left[ (\alpha(s^A, s^D), [t_I, t_F]) \mid (s^A, t_I) \right] = \prod_{\kappa = 1}^{\kappa_{max}} \mu_P\left[ (\Lambda^{ID}[\gamma_\kappa; \alpha(s^A, s^D)], [t_I, t_F]) \mid (s^A, t_I) \right]$.

- Eq.(SM-2.13)

Here the multiplication factor,

$\mu_P\left[ (\Lambda^{ID}[\gamma_\kappa; \alpha(s^A, s^D)], [t_I, t_F]) \mid (s^A, t_I) \right] \equiv \sum_{\vec{\hat{M}}[\gamma_\kappa] \in \Lambda^{ID}[\gamma_\kappa; \alpha(s^A, s^D)]} \mu_P\left[ (\vec{\hat{M}}[\gamma_\kappa], [t_I, t_F]) \mid (s^A, t_I) \right]$

- Eq.(SM-2.14)

(which corresponds to Eq.(R6.8)) represents the total contribution to the PWA probability
by all PWA-consistent local indel histories that can take place in $\gamma_\kappa$. Finally, the definition of
the PWA probability quotient, Eq.(SM-2.11), transforms Eq.(SM-2.13) into the following key
equation for the factorable ab initio PWA probability:

$P\left[ (\alpha(s^A, s^D), [t_I, t_F]) \mid (s^A, t_I) \right] = P\left[ ([\,], [t_I, t_F]) \mid (s^A, t_I) \right] \prod_{\kappa = 1}^{\kappa_{max}} \mu_P\left[ (\Lambda^{ID}[\gamma_\kappa; \alpha(s^A, s^D)], [t_I, t_F]) \mid (s^A, t_I) \right]$.

- Eq.(SM-2.15)

(It corresponds to Eq.(R6.7).)
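The step from Eq.(SM-2.11) and Eq.(SM-2.12) to Eq.(SM-2.13) is the distributive law applied to a direct product of sets. The following sketch illustrates it with hypothetical per-region quotients (the numbers and the function names are made up for illustration): summing the product of Eq.(SM-2.12) over every combination of local histories gives the same value as multiplying the per-region sums of Eq.(SM-2.14).

```python
from itertools import product
from math import isclose, prod

def sum_over_lhs(region_quotients):
    """Sum over all LHSs (one local history per region) of the product
    of local-history quotients, as in Eq.(SM-2.11) with Eq.(SM-2.12)."""
    return sum(prod(choice) for choice in product(*region_quotients))

def product_over_regions(region_quotients):
    """Product over regions of the per-region sums, as in Eq.(SM-2.13)
    with Eq.(SM-2.14)."""
    return prod(sum(quotients) for quotients in region_quotients)

# Hypothetical quotients of the PWA-consistent local histories in three
# regions. A region that may stay event-free includes the empty local
# history, whose quotient is 1.
region_quotients = [
    [1.0, 0.02],
    [0.3, 0.05, 0.01],
    [0.4],
]
print(sum_over_lhs(region_quotients), product_over_regions(region_quotients))
```

The enumeration on the left grows as the product of the per-region set sizes, whereas the factorized form needs only one sum per region, which is the practical benefit of the factorization.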


SM-3. Factorability of probability of simplest LHS equivalence class 

To illustrate how the factorization, Eq.(R6.2) (or Eq.(SM-2.8)), can be satisfied, here we will
examine the probability of the simplest concrete LHS equivalence class,
$\left[ \left\{ [\hat{M}_D(2, 4)], [\hat{M}_I(6, 3)] \right\} \right]_{LHS}$ (Figure 5). In this example, the two constituent indel histories,
$[\hat{M}_D(2, 4), \hat{M}_I(3, 3)]$ and $[\hat{M}_I(6, 3), \hat{M}_D(2, 4)]$, share the ancestral state,
$s^A = [1, 2, 3, 4, 5, 6, 7]$, and the descendant state, $s^D = [1, 5, 6, 8, 9, A, 7]$. In addition, the
histories have their own intermediate states, $\langle s^a | = \langle s^A | \hat{M}_D(2, 4)$ ($= \langle [1, 5, 6, 7] |$) and
$\langle s^b | = \langle s^A | \hat{M}_I(6, 3)$ ($= \langle [1, 2, 3, 4, 5, 6, 8, 9, A, 7] |$), respectively (Figure 5, panels a and b).
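The states quoted above can be reproduced mechanically. The sketch below is a minimal illustration, assuming that a sequence state is represented as a list of ancestry labels and that the labels of inserted sites (8, 9, A) are supplied by the caller; the function names are introduced here and are not taken from the paper. It applies the two histories to $s^A$, confirms that both reach $s^D$, and reads off the columns of the resulting PWA from the shared labels.

```python
def apply_insertion(state, x, new_sites):
    """M_I(x, l): insert l = len(new_sites) sites between positions x and
    x + 1 (1-based positions)."""
    return state[:x] + list(new_sites) + state[x:]

def apply_deletion(state, x_b, x_e):
    """M_D(x_B, x_E): delete the sites at positions x_B..x_E (1-based,
    inclusive)."""
    return state[: x_b - 1] + state[x_e:]

def alignment_columns(anc, desc):
    """Columns (ancestral site, descendant site) of the PWA; None is a gap.
    Assumes desc was derived from anc by insertions and deletions only."""
    anc_set, desc_set = set(anc), set(desc)
    cols, i, j = [], 0, 0
    while i < len(anc) or j < len(desc):
        if i < len(anc) and anc[i] not in desc_set:
            cols.append((anc[i], None))       # site deleted
            i += 1
        elif j < len(desc) and desc[j] not in anc_set:
            cols.append((None, desc[j]))      # site inserted
            j += 1
        else:
            cols.append((anc[i], desc[j]))    # preserved ancestral site
            i += 1
            j += 1
    return cols

s_A = [1, 2, 3, 4, 5, 6, 7]
new_sites = [8, 9, "A"]

# History 1: deletion first, then insertion (positions refer to s^a).
s_a = apply_deletion(s_A, 2, 4)
s_D_1 = apply_insertion(s_a, 3, new_sites)

# History 2: insertion first, then deletion (positions refer to s^b).
s_b = apply_insertion(s_A, 6, new_sites)
s_D_2 = apply_deletion(s_b, 2, 4)

print(s_a, s_b, s_D_1, s_D_2)
print(alignment_columns(s_A, s_D_1))
```

Note that the insertion is written as $\hat{M}_I(3, 3)$ in the first history and as $\hat{M}_I(6, 3)$ in the second because its position is counted in the state it acts on.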

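Using Eq.(SM-2.7), the probability quotient of the first indel history is given by:

$\mu_P\left[ ([\hat{M}_D(2, 4), \hat{M}_I(3, 3)], [t_I, t_F]) \mid (s^A, t_I) \right]$

$\quad = \iint_{t_I < \tau_1 < \tau_2 < t_F} d\tau_1 d\tau_2\, r_D(2, 4; s^A, \tau_1)\, r_I(3, 3; s^a, \tau_2) \exp\left\{ -\int_{\tau_1}^{\tau_2} d\tau\, \delta R_X^{ID}(s^a, s^A, \tau) - \int_{\tau_2}^{t_F} d\tau\, \delta R_X^{ID}(s^D, s^A, \tau) \right\}$

$\quad = \iint_{t_I < \tau_1 < \tau_2 < t_F} d\tau_1 d\tau_2\, r_D(2, 4; s^A, \tau_1)\, r_I(3, 3; s^a, \tau_2) \exp\left\{ -\int_{\tau_2}^{t_F} d\tau\, \delta R_X^{ID}(s^D, s^a, \tau) - \int_{\tau_1}^{t_F} d\tau\, \delta R_X^{ID}(s^a, s^A, \tau) \right\}$.

- Eq.(SM-3.1)

To get the rightmost side, we used the identity: $\delta R_X^{ID}(s^D, s^A, \tau) =
\delta R_X^{ID}(s^D, s^a, \tau) + \delta R_X^{ID}(s^a, s^A, \tau)$. Similarly, the quotient of the second indel history is: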
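$\mu_P\left[ ([\hat{M}_I(6, 3), \hat{M}_D(2, 4)], [t_I, t_F]) \mid (s^A, t_I) \right]$

$\quad = \iint_{t_I < \tau_2 < \tau_1 < t_F} d\tau_2 d\tau_1\, r_I(6, 3; s^A, \tau_2)\, r_D(2, 4; s^b, \tau_1) \exp\left\{ -\int_{\tau_2}^{\tau_1} d\tau\, \delta R_X^{ID}(s^b, s^A, \tau) - \int_{\tau_1}^{t_F} d\tau\, \delta R_X^{ID}(s^D, s^A, \tau) \right\}$

$\quad = \iint_{t_I < \tau_2 < \tau_1 < t_F} d\tau_2 d\tau_1\, r_I(6, 3; s^A, \tau_2)\, r_D(2, 4; s^b, \tau_1) \exp\left\{ -\int_{\tau_2}^{t_F} d\tau\, \delta R_X^{ID}(s^b, s^A, \tau) - \int_{\tau_1}^{t_F} d\tau\, \delta R_X^{ID}(s^D, s^b, \tau) \right\}$.

- Eq.(SM-3.2)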


The total quotient of the subject LHS equivalence class is the summation of Eqs.(SM-3.1,2).
We first notice that, modulo differences of measure zero, the union of the two domains of
integration is a direct product:

$\{ (\tau_1, \tau_2) \mid t_I < \tau_1 < \tau_2 < t_F \} \cup \{ (\tau_1, \tau_2) \mid t_I < \tau_2 < \tau_1 < t_F \}$
$\quad = \{ \tau_1 \mid t_I < \tau_1 < t_F \} \times \{ \tau_2 \mid t_I < \tau_2 < t_F \}$.

- Eq.(SM-3.3)


Thus, the total quotient can be factorized as:

$\mu_P\left[ \left( \left[ \left\{ [\hat{M}_D(2, 4)], [\hat{M}_I(6, 3)] \right\} \right]_{LHS}, [t_I, t_F] \right) \mid (s^A, t_I) \right]$

$\quad = \int_{t_I}^{t_F} d\tau_1\, r_D(2, 4; s^A, \tau_1) \exp\left\{ -\int_{\tau_1}^{t_F} d\tau\, \delta R_X^{ID}(s^a, s^A, \tau) \right\} \times \int_{t_I}^{t_F} d\tau_2\, r_I(6, 3; s^A, \tau_2) \exp\left\{ -\int_{\tau_2}^{t_F} d\tau\, \delta R_X^{ID}(s^b, s^A, \tau) \right\}$

$\quad = \mu_P\left[ ([\hat{M}_D(2, 4)], [t_I, t_F]) \mid (s^A, t_I) \right] \mu_P\left[ ([\hat{M}_I(6, 3)], [t_I, t_F]) \mid (s^A, t_I) \right]$,

- Eq.(SM-3.4)

provided that the following equations are satisfied:

$r_D(2, 4; s^b, \tau_1) = r_D(2, 4; s^A, \tau_1)$, - Eq.(SM-3.5a)

$r_I(3, 3; s^a, \tau_2) = r_I(6, 3; s^A, \tau_2)$, - Eq.(SM-3.5b)

$\delta R_X^{ID}(s^D, s^b, \tau) = \delta R_X^{ID}(s^a, s^A, \tau)$, - Eq.(SM-3.5c)

$\delta R_X^{ID}(s^D, s^a, \tau) = \delta R_X^{ID}(s^b, s^A, \tau)$. - Eq.(SM-3.5d)

Eq.(SM-3.5a) and Eq.(SM-3.5b) correspond to condition (i) in section R6 of Results and
discussion. And, owing to the above definitions of $s^a$ and $s^b$, and to the equations
$\langle s^D | = \langle s^a | \hat{M}_I(3, 3) = \langle s^b | \hat{M}_D(2, 4)$, we see that Eq.(SM-3.5c) and Eq.(SM-3.5d) correspond to
condition (ii) in section R6. Eq.(SM-3.4) is a concrete instance of the factorability, Eq.(R6.2)
(or Eq.(SM-2.8)), when $\vec{\hat{M}} = \left\{ [\hat{M}_D(2, 4)], [\hat{M}_I(6, 3)] \right\}$. The factorability for more
complex LHS equivalence classes could also be demonstrated concretely, although the
procedure becomes more cumbersome and lengthy. In any case, the proof can be generalized,
as is fully described in Supplementary appendix SA-2 in Additional file 2.
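The role of Eqs.(SM-3.5a-d) can be illustrated numerically. The sketch below is a toy check under simplifying assumptions that are not part of the derivation above: all rates are time-independent, the numerical values are hypothetical, and the function name and dictionary keys are introduced here. It evaluates the sum of Eq.(SM-3.1) and Eq.(SM-3.2) on a midpoint grid (splitting the measure-zero diagonal equally between the two histories) and compares it with the product of the two single-event quotients on the right-hand side of Eq.(SM-3.4).

```python
import numpy as np

def lhs_class_quotients(r_d, r_i, exit_rate, t_i=0.0, t_f=1.0, n_grid=400):
    """Return (sum of Eqs.(SM-3.1,2), product in Eq.(SM-3.4)).

    r_d: rate of M_D(2,4) acting on state 'A' or 'b'.
    r_i: rate of the insertion acting on state 'A' (M_I(6,3)) or 'a' (M_I(3,3)).
    exit_rate: exit rates of states 'A', 'a', 'b', 'D' (constants, assumed).
    """
    h = (t_f - t_i) / n_grid
    taus = t_i + (np.arange(n_grid) + 0.5) * h
    t1, t2 = np.meshgrid(taus, taus, indexing="ij")  # t1: deletion, t2: insertion

    def d(s, s_ref):
        # Increment of the exit rate, delta R(s, s_ref).
        return exit_rate[s] - exit_rate[s_ref]

    # History 1, Eq.(SM-3.1): deletion at t1, then insertion at t2 > t1.
    f1 = r_d["A"] * r_i["a"] * np.exp(-d("a", "A") * (t2 - t1)
                                      - d("D", "A") * (t_f - t2))
    # History 2, Eq.(SM-3.2): insertion at t2, then deletion at t1 > t2.
    f2 = r_i["A"] * r_d["b"] * np.exp(-d("b", "A") * (t1 - t2)
                                      - d("D", "A") * (t_f - t1))
    w1 = (t1 < t2) + 0.5 * (t1 == t2)
    w2 = (t2 < t1) + 0.5 * (t1 == t2)
    total = h * h * np.sum(w1 * f1 + w2 * f2)

    single_d = h * np.sum(r_d["A"] * np.exp(-d("a", "A") * (t_f - taus)))
    single_i = h * np.sum(r_i["A"] * np.exp(-d("b", "A") * (t_f - taus)))
    return total, single_d * single_i

# Exit rates chosen so that Eqs.(SM-3.5c,d) hold: R(D) = R(a) + R(b) - R(A).
exit_rate = {"A": 0.7, "a": 0.5, "b": 1.0, "D": 0.8}
ok = lhs_class_quotients({"A": 0.3, "b": 0.3}, {"A": 0.2, "a": 0.2}, exit_rate)
# Violating Eq.(SM-3.5a): the deletion rate depends on the distant insertion.
bad = lhs_class_quotients({"A": 0.3, "b": 0.6}, {"A": 0.2, "a": 0.2}, exit_rate)
print(ok, bad)
```

With the conditions satisfied, the two returned numbers agree; once the deletion rate is allowed to depend on the insertion elsewhere in the sequence, they no longer do.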

SM-4. Factorability of multiple sequence alignment probability: details 

As in section R7 of Results and discussion, here we formally calculate the ab initio
probability of a MSA given a rooted phylogenetic tree, $T = (\{n\}_T, \{b\}_T)$, where $\{n\}_T$ is the
set of all nodes of the tree, and $\{b\}_T$ is the set of all branches of the tree. We decompose the
set of all nodes as: $\{n\}_T = N^{IN}(T) + N^{X}(T)$, where $N^{IN}(T)$ is the set of all internal nodes
and $N^{X}(T) = \{n_1, \ldots, n_{N^X}\}$ is the set of all external nodes. (The $N^X \equiv |N^{X}(T)|$ is the number
of external nodes.) The root node plays an important role and will be denoted as $n^{Root}(T)$, or
simply $n^{Root}$. Because the tree is rooted, each branch $b$ is directed. Thus, let $n^A(b)$ denote
the “ancestral node” on the upstream end of $b$, and let $n^D(b)$ denote the “descendant node”
on the downstream end of $b$. Let $s(n) \in S^{II}$ be a sequence state at the node $n \in \{n\}_T$.
Especially, let $s^A(b) \equiv s(n^A(b)) \in S^{II}$ denote a sequence state at $n^A(b)$ and let
$s^D(b) \equiv s(n^D(b)) \in S^{II}$ denote a sequence state at $n^D(b)$. Finally, as mentioned in
Background, we suppose that the branch lengths, $\{ |b| \mid b \in \{b\}_T \}$, and the indel model
parameters, $\{\Theta_{ID}(b)\}_T = \{ \Theta_{ID}(b) \mid b \in \{b\}_T \}$, are all given. Note that the model parameters
$\Theta_{ID}(b)$ could vary depending on the branch, at least theoretically.

First, we extend the ideas proposed by [13,14,36] to each indel history along a tree,
by regarding the indel history along a branch as a map (or a transformation) from the ancestral
sequence state to the descendant sequence state, as follows. An indel history along a tree
consists of indel histories along all branches of the tree that are interdependent, in the sense
that the indel process of a branch $b$ determines a sequence state $s^D(b)$ at its descendant
node $n^D(b)$, on which the indel processes along its downstream branches depend. Thus, an
indel history on a given root sequence state $s^{Root} = s(n^{Root}) \in S^{II}$ automatically determines
the sequence states at all nodes, $\{ s(n) \in S^{II}$ for $\forall n \in \{n\}_T \}$. Let $H^{ID}(s_0) \equiv \bigcup_{N=0}^{\infty} H^{ID}(N; s_0)$
(with $H^{ID}(N; s_0)$ defined below Eq.(R4.6)) be the set of all indel histories along a time axis
(or a branch) starting with state $s_0$. Then, each indel history, $\{\vec{\hat{M}}(b)\}_T$, along tree $T$ and
starting with $s^{Root}$ can be specifically expressed as:

$\{\vec{\hat{M}}(b)\}_T = \left\{ \vec{\hat{M}}(b) = [\hat{M}_1(b), \ldots, \hat{M}_{N(b)}(b)] \in H^{ID}(s^A(b)) \right\}_{\{b\}_T}$,

with $s(n^{Root}(T)) = s^{Root}$ and $\langle s^D(b) | = \langle s^A(b) | \hat{M}_1(b) \cdots \hat{M}_{N(b)}(b)$ for $\forall b \in \{b\}_T$.

- Eq.(SM-4.1)


(It corresponds to Eq.(R7.1).) Here, the symbol, $\hat{M}_\nu(b)$, denotes the $\nu$th event in the indel
history along branch $b \in \{b\}_T$. The probability of the indel history, Eq.(SM-4.1), can be
easily calculated. First, we already gave the conditional probability of an indel history during
the time interval $[t_I, t_F]$, by Eq.(R4.7). Because each branch $b \in \{b\}_T$ corresponds to
a time interval $[t(n^A(b)), t(n^D(b))]$ (with $t(n^D(b)) - t(n^A(b)) = |b|$), the probability of an
indel history, $\vec{\hat{M}}(b) = [\hat{M}_1(b), \ldots, \hat{M}_{N(b)}(b)] \in H^{ID}(s^A(b))$, along a branch $b \in \{b\}_T$ is given
by:

$P\left[ (\vec{\hat{M}}(b), b) \mid (s^A(b), n^A(b)) \right] = P\left[ \left( [\hat{M}_1(b), \ldots, \hat{M}_{N(b)}(b)], [t(n^A(b)), t(n^D(b))] \right) \mid (s^A(b), t(n^A(b))) \right] \Big|_{\Theta_{ID}(b)}$.

- Eq.(SM-4.2)


(It corresponds to Eq.(R7.3).) Here we explicitly showed the branch-dependence of the model
parameters. Using Eq.(SM-4.2) as a building block, the probability of the indel history along
$T$, $\{\vec{\hat{M}}(b)\}_T$, specified by Eq.(SM-4.1) (or Eq.(R7.1)), is given as:

$P\left[ \{\vec{\hat{M}}(b)\}_T \mid (s^{Root}, n^{Root}) \right] = \prod_{b \in \{b\}_T} P\left[ (\vec{\hat{M}}(b), b) \mid (s^A(b), n^A(b)) \right]$,

with $\langle s^D(b) | = \langle s^A(b) | \hat{M}_1(b) \cdots \hat{M}_{N(b)}(b)$ for $\forall b \in \{b\}_T$.

- Eq.(SM-4.3)

(It corresponds to Eq.(R7.2).)

In this way, we can calculate the probability of any indel history $\{\vec{\hat{M}}(b)\}_T$ along tree
$T$ starting with a given root state, $s^{Root} \in S^{II}$.

Now, an important fact is that an indel history, along a tree starting with a root
sequence state, uniquely yields a MSA, $\alpha[s_1, s_2, \ldots, s_{N^X}]$, among the sequences at the external
nodes, $s_i = s(n_i) \in S^{II}$ ($n_i \in N^{X}(T)$). [NOTE: Remember that the term “MSA” here means
its homology structure.] However, the converse is not true. That is, a given MSA,
$\alpha[s_1, s_2, \ldots, s_{N^X}]$, could result from a large number of alternative indel histories along a tree,
even when starting with a given sequence state at the root. Moreover, there could be infinitely
many root states consistent with a given MSA. Here, let $(s^{Root}, \{\vec{\hat{M}}(b)\}_T)$ be a pair of a root
state and an indel history along $T$ starting with the state. And let $\Psi^{ID}[\alpha[s_1, s_2, \ldots, s_{N^X}]; T]$
be the set of all such pairs defined on $T$ consistent with $\alpha[s_1, s_2, \ldots, s_{N^X}]$. Then, as the
probability of a given PWA is expressed as Eq.(R4.9) supplemented with Eq.(R4.7), the
probability of a given MSA under a given model setting (including $T$) should be expressed
as:

$P\left[ \alpha[s_1, s_2, \ldots, s_{N^X}] \mid T \right] = \sum_{(s^{Root}, \{\vec{\hat{M}}(b)\}_T) \in \Psi^{ID}[\alpha[s_1, s_2, \ldots, s_{N^X}]; T]} P\left[ (s^{Root}, n^{Root}) \right] P\left[ \{\vec{\hat{M}}(b)\}_T \mid (s^{Root}, n^{Root}) \right]$,

- Eq.(SM-4.4)

which (corresponding to Eq.(R7.4)) is supplemented with Eq.(SM-4.3) (or Eq.(R7.2)). Here,
$P[(s^{Root}, n^{Root})]$ is the probability of state $s^{Root}$ at the root node ($n^{Root}$). (It may be
interpreted as the prior in a Bayesian formalism.) Eq.(SM-4.4) supplemented with
Eq.(SM-4.3) could be interpreted as the “perturbation expansion” of an ab initio MSA
probability. To make this formal expansion formula more tractable, we consider the ancestral
sequence states at all internal nodes, and let $\{s(n)\}_{N^{IN}} \equiv \{ s(n) \in S^{II} \mid n \in N^{IN}(T) \}$ denote a set
of such ancestral states (or, more precisely, its equivalence class in the sense of endnote (h)
(or 8)). To be consistent with a given MSA, the ancestral states must satisfy the “phylogenetic
correctness” condition in each MSA column [37,38]. [NOTE: The “phylogenetic correctness”
condition guarantees that the sites aligned in a MSA column should share an ancestry. The
condition could be rephrased as: “if a site corresponding to the column is present at two
points in the phylogenetic tree, the site must also be present all along the shortest path
connecting the two points.”] As long as the condition is fulfilled in all MSA columns,
however, any set of states must be allowed. So, let $\Sigma[\alpha[s_1, s_2, \ldots, s_{N^X}]; \{n \in N^{IN}(T)\}; T]$ be
the set of all $\{s(n)\}_{N^{IN}}$’s consistent with $\alpha[s_1, s_2, \ldots, s_{N^X}]$ (and tree $T$). Then, the
aforementioned set, $\Psi^{ID}[\alpha[s_1, s_2, \ldots, s_{N^X}]; T]$, can be uniquely decomposed into the following
direct sum:

$\Psi^{ID}[\alpha[s_1, s_2, \ldots, s_{N^X}]; T] = \bigcup_{\{s(n)\}_{N^{IN}} \in \Sigma[\alpha[s_1, s_2, \ldots, s_{N^X}]; \{n \in N^{IN}(T)\}; T]} \Psi^{ID}[\alpha[s_1, s_2, \ldots, s_{N^X}]; \{s(n)\}_{N^{IN}}; T]$.

- Eq.(SM-4.5)
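The “phylogenetic correctness” condition quoted in the NOTE above can be checked per MSA column at the level of tree nodes. The following sketch is a simplified illustration, assuming that the tree is given as a child-to-parent mapping and that the column is summarized by the set of nodes at which the site is present; the function name and the example tree are hypothetical. It uses the fact that a set of nodes in a tree is connected exactly when it has a single topmost member, i.e. a single node whose parent is not in the set.

```python
def is_phylogenetically_correct(parent, present):
    """Check that the nodes carrying a site form a connected subtree.

    parent:  dict mapping each node to its parent (the root maps to None).
    present: set of nodes (internal and external) at which the site exists.
    """
    if not present:
        return True  # an empty column constrains nothing
    # Each connected component of 'present' has exactly one topmost node.
    tops = [node for node in present if parent.get(node) not in present]
    return len(tops) == 1

# Hypothetical rooted tree: R -> (X -> (n1, n2), n3).
parent = {"R": None, "X": "R", "n1": "X", "n2": "X", "n3": "R"}

print(is_phylogenetically_correct(parent, {"n1", "n2", "X"}))  # site gained at X
print(is_phylogenetically_correct(parent, {"n1", "n3"}))       # path via X, R missing
```

A set of ancestral states belongs to $\Sigma[\ldots]$ only if every column passes such a check; this node-level test ignores points in the interior of branches, which the condition as stated also covers.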

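Here, $\Psi^{ID}[\alpha[s_1, s_2, \ldots, s_{N^X}]; \{s(n)\}_{N^{IN}}; T]$ denotes the set of indel histories along $T$
consistent with both the MSA ($\alpha[s_1, s_2, \ldots, s_{N^X}]$) and the ancestral sequence states ($\{s(n)\}_{N^{IN}}$).
Substituting Eq.(SM-4.5) into Eq.(SM-4.4), we have:

$P\left[ \alpha[s_1, s_2, \ldots, s_{N^X}] \mid T \right] = \sum_{\{s(n)\}_{N^{IN}} \in \Sigma[\alpha[s_1, s_2, \ldots, s_{N^X}]; \{n \in N^{IN}(T)\}; T]} P\left[ \alpha[s_1, s_2, \ldots, s_{N^X}]; \{s(n)\}_{N^{IN}} \mid T \right]$.

- Eq.(SM-4.6)

(It corresponds to Eq.(R7.5).) Here,

$P\left[ \alpha[s_1, s_2, \ldots, s_{N^X}]; \{s(n)\}_{N^{IN}} \mid T \right] = \sum_{(s^{Root}, \{\vec{\hat{M}}(b)\}_T) \in \Psi^{ID}[\alpha[s_1, s_2, \ldots, s_{N^X}]; \{s(n)\}_{N^{IN}}; T]} P\left[ (s^{Root}, n^{Root}) \right] P\left[ \{\vec{\hat{M}}(b)\}_T \mid (s^{Root}, n^{Root}) \right]$

- Eq.(SM-4.7)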

is the probability of simultaneously getting $\alpha[s_1, s_2, \ldots, s_{N^X}]$ and $\{s(n)\}_{N^{IN}}$. Thus, all terms in
Eq.(SM-4.7) share the same homology structure among sequence states at all nodes.
Especially, the sequence states at internal nodes have homology structures (with states at
other nodes) fixed for respective nodes. And each history consists of indel histories along
branches consistent with each other (as in Eq.(SM-4.1) (or Eq.(R7.1))). Because of this, and
because the states at the internal nodes having node-fixed homology structures can
be used as “anchors,” the history component of $\Psi^{ID}[\alpha[s_1, s_2, \ldots, s_{N^X}]; \{s(n)\}_{N^{IN}}; T]$ can be
vertically decomposed into a direct product:

$\Psi^{ID}[\alpha[s_1, s_2, \ldots, s_{N^X}]; \{s(n)\}_{N^{IN}}; T] = s^{Root} \times \left( \times_{b \in \{b\}_T} H^{ID}[\alpha(s^A(b), s^D(b))] \right)$. - Eq.(SM-4.8)

\ Ae Wr / 

Here, s A {b) and s n (b) for each branch are proper elements in the set of (the equivalence 
classes of) states, {y} M U {s(n)} NW . (All pairs, fs Soor , |m(&)| j’s, share the root state.) 

Substituting Eq.(SM-4.3) and Eq.(SM-4.8) into Eq.(SM-4.7), and lumping together the terms
along each branch using Eq.(R4.9), we finally get:

$$P[\alpha[s_1,s_2,\ldots,s_{N^X}]; \{s(n)\}_{\mathrm{N}^{IN}} \mid T] = P[(s^{Root}, n^{Root})] \prod_{b \in \{b\}_T} P[(\alpha(s^A(b), s^D(b)), b) \mid (s^A(b), n^A(b))] .$$
— Eq.(SM-4.9)

(It corresponds to Eq.(R7.6).) Here,

$$P[(\alpha(s^A(b), s^D(b)), b) \mid (s^A(b), n^A(b))] = P[(\alpha(s^A(b), s^D(b)), [t(n^A(b)), t(n^D(b))]) \mid (s^A(b), t(n^A(b)))]$$
— Eq.(SM-4.10)

(which corresponds to Eq.(R7.7)) is the probability of the ancestor-descendant PWA along
branch $b$. Eq.(SM-4.9) is basically the expression proposed in [13,14], and we
demonstrated in effect that their proposal also holds even with a genuine stochastic
evolutionary model. Usually, Eq.(SM-4.6) supplemented with Eq.(SM-4.9) is much more
tractable than Eq.(SM-4.4) supplemented with Eq.(SM-4.3), for two reasons. (1)
Usually, it is not the indel history (along the tree) but (the homology structure of) the set of 
ancestral sequence states that is inferred from a given MSA. (2) The probability of each indel 
history along the tree (Eq.(SM-4.3)) is not factorable in general, whereas Eq.(SM-4.9) is a 
product of PWA probabilities, each of which should be factorable if the conditions (i) and (ii) 
in section R6 are satisfied. 
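
Eq.(SM-4.9) says that, once the ancestral states are fixed, the joint probability is the root prior times a product of branch-wise PWA probabilities. The following is a minimal Python sketch of that bookkeeping only, evaluated in log space to avoid underflow on large trees. The function name, the dictionary layout, and all numerical values are hypothetical placeholders; in practice each branch term would be the PWA probability of Eq.(SM-4.10).

```python
import math


def log_prob_msa_and_ancestors(log_root_prior, log_branch_pwa_probs):
    """Log of Eq.(SM-4.9): log P[root state] + sum over branches of log P[PWA along b].

    log_root_prior       -- log P[(s^Root, n^Root)]   (hypothetical input)
    log_branch_pwa_probs -- dict: branch ID -> log P[(alpha(s^A(b), s^D(b)), b) | (s^A(b), n^A(b))]
    """
    return log_root_prior + sum(log_branch_pwa_probs.values())


# Hypothetical numbers for a three-branch tree (illustration only).
log_root_prior = math.log(0.01)
log_branch = {"b1": math.log(0.9), "b2": math.log(0.5), "b3": math.log(0.8)}

log_p = log_prob_msa_and_ancestors(log_root_prior, log_branch)
prob = math.exp(log_p)  # equals 0.01 * 0.9 * 0.5 * 0.8
```

Summing such terms over all MSA-consistent sets of ancestral states would then give the MSA probability of Eq.(SM-4.6).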

Now, we seek to factorize the ab initio MSA probability into a form somewhat
similar to Eq.(R6.7) for the ab initio PWA probability. In subsection 4.2 of [32], we did so
using the history-based expansion of the MSA probability (i.e., Eq.(SM-4.4) supplemented
with Eq.(SM-4.3)). Here, we will use the ancestral-state-based expansion (i.e., Eq.(SM-4.6)
supplemented with Eq.(SM-4.9)), as was only briefly sketched at the bottom of subsection 4.2
of [32]. In a MSA, gapless columns play almost the same role as PASs in a PWA. Because of
the aforementioned “phylogenetic correctness” condition, a gapless column indicates that the
site in question existed all across the phylogenetic tree, and thus that no indel event hit or
pierced the site. Therefore, gapless columns will partition a MSA into regions each of which
accommodates a local subset of every global history. Analogously to the argument above
Eq.(SM-2.12), let $C_1, C_2, \ldots, C_{\mathrm{K}_{max}}$ be the maximum possible set of such regions determined
by a given MSA ($\alpha[s_1,s_2,\ldots,s_{N^X}]$) and a model setting (including tree $T$). (As argued there,
not all gapless columns are necessarily needed to delimit the regions.) Meanwhile, if the
conditions (i) and (ii) in section R6 are satisfied, each factor in the product in Eq.(SM-4.9)
can be factorized as in Eq.(R6.7):
$$P[(\alpha(s^A(b), s^D(b)), b) \mid (s^A(b), n^A(b))] = P[([\,], b) \mid (s^A(b), n^A(b))] \prod_{\kappa_b=1}^{\kappa_{max}(b)} \mu_P[(\Lambda^{ID}[\gamma_{\kappa_b}(b); \alpha(s^A(b), s^D(b))], b) \mid (s^A(b), n^A(b))] .$$
— Eq.(SM-4.11)

Here we used a notation that makes the dependence on the branch ($b$) explicit.
In particular, $\{\gamma_{\kappa_b}(b)\}_{\kappa_b=1,\ldots,\kappa_{max}(b)}$ denotes the maximum set of regions accommodating local
indel histories along $b$ consistent with the PWA, $\alpha(s^A(b), s^D(b))$ (Figure S2). Because the
set of gapless columns delimiting $\{C_{\mathrm{K}}\}_{\mathrm{K}=1,\ldots,\mathrm{K}_{max}}$ defines a subset of PASs in $\alpha(s^A(b), s^D(b))$
delimiting $\{\gamma_{\kappa_b}(b)\}_{\kappa_b=1,\ldots,\kappa_{max}(b)}$, each $C_{\mathrm{K}}$
should encompass at least one $\gamma_{\kappa_b}(b)$ (Figure S2).
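
To make the role of gapless columns concrete, the sketch below splits an MSA (including the ancestral rows) into candidate regions lying between consecutive gapless columns, including zero-width regions that could potentially accommodate indels but contain none. It is a simplification under a stated assumption: every gapless column is treated as a delimiter, whereas the text above notes that, depending on the model, not all gapless columns necessarily delimit the regions. The function name, the gap character, and the toy alignment are hypothetical.

```python
def potential_regions(msa_rows, gap="-"):
    """Return candidate regions between consecutive gapless columns.

    Each region is a dict holding a half-open column interval [start, end)
    and a flag telling whether it contains any gapped column.
    Assumption (not from the source): every gapless column is a delimiter.
    """
    n_cols = len(msa_rows[0])
    if any(len(row) != n_cols for row in msa_rows):
        raise ValueError("all MSA rows must have the same length")

    gapless_cols = [c for c in range(n_cols)
                    if all(row[c] != gap for row in msa_rows)]
    # Sentinels so that the regions before the first and after the last
    # gapless column are also reported.
    boundaries = [-1] + gapless_cols + [n_cols]

    regions = []
    for left, right in zip(boundaries, boundaries[1:]):
        start, end = left + 1, right  # columns strictly between two delimiters
        regions.append({"columns": (start, end),
                        "has_gapped_columns": end > start})
    return regions


# Toy alignment (hypothetical): columns 0, 3 and 4 are gapless.
toy_msa = ["AC-GT",
           "ACGGT",
           "A--GT"]
regions = potential_regions(toy_msa)
```

With three gapless columns the sketch reports four candidate regions, only one of which (columns 1 and 2) actually contains gaps; the others play the role of the regions that can potentially accommodate local indel histories but actually do not.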


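Thus, Eq.(SM-4.9) supplemented with Eq.(SM-4.11) could be rearranged as:

$$P[\alpha[s_1,s_2,\ldots,s_{N^X}]; \{s(n)\}_{\mathrm{N}^{IN}} \mid T] = P[(s^{Root}, n^{Root})] \left( \prod_{b \in \{b\}_T} P[([\,], b) \mid (s^A(b), n^A(b))] \right) \left( \prod_{\mathrm{K}=1}^{\mathrm{K}_{max}} \mathrm{M}_P[\alpha[s_1,s_2,\ldots,s_{N^X}]; \{s(n)\}_{\mathrm{N}^{IN}}; C_{\mathrm{K}} \mid T] \right) .$$
— Eq.(SM-4.12)

Here, the “raw” multiplication factor contributed from the region, $C_{\mathrm{K}}$, is given by:

$$\mathrm{M}_P[\alpha[s_1,s_2,\ldots,s_{N^X}]; \{s(n)\}_{\mathrm{N}^{IN}}; C_{\mathrm{K}} \mid T] \equiv \prod_{b \in \{b\}_T} \; \prod_{\gamma_{\kappa_b}(b) \subset C_{\mathrm{K}}} \mu_P[(\Lambda^{ID}[\gamma_{\kappa_b}(b); \alpha(s^A(b), s^D(b))], b) \mid (s^A(b), n^A(b))] .$$
— Eq.(SM-4.13)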


To factorize the total probability of $\alpha[s_1,s_2,\ldots,s_{N^X}]$, Eq.(SM-4.6) (or Eq.(R7.5)), we need to
consider multiple sets of ancestral states. For this purpose, we introduce a “reference” root
sequence state, $s_0^{Root}$. It can be anything, as long as it is the state at the root consistent with
$\alpha[s_1,s_2,\ldots,s_{N^X}]$. Technically, one good candidate for $s_0^{Root}$ would be a root state obtained by
applying the Dollo parsimony principle [39] to each column of the MSA, because it is
arguably the most readily available state that satisfies the phylogenetic correctness condition
along the entire MSA. Given a reference, $s_0^{Root}$, each ancestral state $s^A(b)$ should differ
from $s_0^{Root}$ only within some $C_{\mathrm{K}}$’s. Moreover, the condition (ii) in section R6 guarantees
that the impacts of their differences within separate $C_{\mathrm{K}}$’s on the exit rate should be
independent of each other. Thus, we have:

$$R_X^{ID}(s^A(b), t) = R_X^{ID}(s_0^{Root}, t) + \sum_{\mathrm{K}=1}^{\mathrm{K}_{max}} \delta R_X^{ID}(s^A(b), s_0^{Root}, t)[C_{\mathrm{K}}] ,$$
— Eq.(SM-4.14)

where $\delta R_X^{ID}(s^A(b), s_0^{Root}, t)[C_{\mathrm{K}}]$ is the increment of the exit rate due to the difference
between $s^A(b)$ and $s_0^{Root}$ within the region $C_{\mathrm{K}}$. Remembering that
$P[([\,], b) \mid (s^A(b), n^A(b))] = \exp\left( -\int_{t(n^A(b))}^{t(n^D(b))} d\tau \, R_X^{ID}(s^A(b), \tau) \right)$, the product in the middle of the
right hand side of Eq.(SM-4.12) can be rewritten as:

$$\prod_{b \in \{b\}_T} P[([\,], b) \mid (s^A(b), n^A(b))] = P[\{[\,]\}_T \mid (s_0^{Root}, n^{Root})] \prod_{b \in \{b\}_T} \exp\left\{ -\sum_{\mathrm{K}=1}^{\mathrm{K}_{max}} \int_{t(n^A(b))}^{t(n^D(b))} d\tau \, \delta R_X^{ID}(s^A(b), s_0^{Root}, \tau)[C_{\mathrm{K}}] \right\} .$$
— Eq.(SM-4.15)

Here, $P[\{[\,]\}_T \mid (s_0^{Root}, n^{Root})] \equiv \exp\left( -\sum_{b \in \{b\}_T} \int_{t(n^A(b))}^{t(n^D(b))} d\tau \, R_X^{ID}(s_0^{Root}, \tau) \right)$ is the probability that
the sequence underwent no indel all across the tree ($T$), given that the state was
$s_0^{Root}$ at the root. The remaining factor is the (prior) probability of the state at the root,
$P[(s^{Root}, n^{Root})]$. We will impose a third condition:

Condition (iii):

$$P[(s^{Root}, n^{Root})] = P[(s_0^{Root}, n^{Root})] \prod_{\mathrm{K}=1}^{\mathrm{K}_{max}} \mu_P[s^{Root}, s_0^{Root}, n^{Root}; C_{\mathrm{K}}] .$$
— Eq.(SM-4.16)

(It corresponds to Eq.(R7.8).) Here the multiplication factor, $\mu_P[s^{Root}, s_0^{Root}, n^{Root}; C_{\mathrm{K}}]$,
represents the change in the state probability at the root due to the difference between
$s_0^{Root}$ and $s^{Root}$ within $C_{\mathrm{K}}$. This equation holds, e.g., when $P[(s^{Root}, n^{Root})]$ is a geometric
distribution or a uniform distribution of the root sequence length, $L(s^{Root})$. [NOTE: HMMs
commonly use geometric distributions of sequence lengths. The uniform distribution may be a
good approximation if we can assume that the ancestral sequence was sampled randomly
from a chromosome of length $L_C$. In this case, the distribution of the sequence length
$L(s)$ ($\ll L_C$) would be proportional to $(1 - (L(s) - 1)/L_C) \approx 1$.] Using Eqs.(SM-4.15,16),
Eq.(SM-4.12) can be rewritten as:
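$$P[\alpha[s_1,s_2,\ldots,s_{N^X}]; \{s(n)\}_{\mathrm{N}^{IN}} \mid T] = P[(s_0^{Root}, n^{Root})] \, P[\{[\,]\}_T \mid (s_0^{Root}, n^{Root})] \prod_{\mathrm{K}=1}^{\mathrm{K}_{max}} \mathrm{M}_P[\alpha[s_1,s_2,\ldots,s_{N^X}]; \{s(n)\}_{\mathrm{N}^{IN}}; s_0^{Root}; C_{\mathrm{K}} \mid T] .$$
— Eq.(SM-4.17)

Here, the “augmented” multiplication factor contributed from $C_{\mathrm{K}}$ is defined as:

$$\mathrm{M}_P[\alpha[s_1,s_2,\ldots,s_{N^X}]; \{s(n)\}_{\mathrm{N}^{IN}}; s_0^{Root}; C_{\mathrm{K}} \mid T] \equiv \mathrm{M}_P[\alpha[s_1,s_2,\ldots,s_{N^X}]; \{s(n)\}_{\mathrm{N}^{IN}}; C_{\mathrm{K}} \mid T] \; \mu_P[s(n^{Root}), s_0^{Root}, n^{Root}; C_{\mathrm{K}}] \times \exp\left\{ -\sum_{b \in \{b\}_T} \int_{t(n^A(b))}^{t(n^D(b))} d\tau \, \delta R_X^{ID}(s^A(b), s_0^{Root}, \tau)[C_{\mathrm{K}}] \right\} .$$
— Eq.(SM-4.18)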
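
The augmented factors above are all defined relative to the reference root state. As an illustration of the Dollo-parsimony candidate mentioned earlier, the following Python sketch decides, column by column, whether a site is present in the reference root state. It assumes the usual reading of Dollo parsimony (a site is gained once, at the most recent common ancestor of all leaves that carry it, and may be lost several times), so the root carries the site only when at least two of its child subtrees contain a leaf with the site. The nested-tuple tree encoding, the function names, and the toy data are hypothetical.

```python
def has_present_leaf(tree, present):
    """True if any leaf under `tree` carries the site (leaves are name strings)."""
    if isinstance(tree, str):
        return present[tree]
    return any(has_present_leaf(child, present) for child in tree)


def dollo_root_presence(tree, present):
    """Dollo parsimony: the root has the site iff it is the MRCA of the carriers."""
    if isinstance(tree, str):  # degenerate single-leaf tree
        return present[tree]
    carriers = sum(has_present_leaf(child, present) for child in tree)
    return carriers >= 2


def dollo_reference_root(tree, msa, gap="-"):
    """Return the MSA column indices whose site is present in the reference root."""
    names = list(msa)
    n_cols = len(msa[names[0]])
    return [c for c in range(n_cols)
            if dollo_root_presence(tree, {n: msa[n][c] != gap for n in names})]


# Toy data (hypothetical): rooted tree ((A,B),(C,D)) and a 5-column MSA.
toy_tree = (("A", "B"), ("C", "D"))
toy_msa = {"A": "AC-GT", "B": "ACGGT", "C": "A--GT", "D": "A-GGT"}
root_columns = dollo_reference_root(toy_tree, toy_msa)
```

In the toy data, column 1 is carried only by leaves A and B, so the site is assigned to their common ancestor rather than to the root, whereas column 2 is carried by leaves on both sides of the root and is therefore kept in the reference root state.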

Substituting Eq.(SM-4.17) into Eq.(SM-4.6) (or Eq.(R7.5)), we are just a step short of the
complete factorization. The final step is the “decomposition” of the space,
$\Sigma[\alpha[s_1,s_2,\ldots,s_{N^X}]; \{n \in \mathrm{N}^{IN}(T)\}; T]$, each of whose elements is a set of MSA-consistent
ancestral states at all internal nodes. For this purpose, we use $s_0^{Root}$ once again, and define
$\Delta\Sigma[s_0^{Root}; \alpha[s_1,s_2,\ldots,s_{N^X}]; \{n \in \mathrm{N}^{IN}(T)\}; T]$ as the space of deviations of MSA-consistent
internal states from $s_0^{Root}$. As argued above, the deviations of ancestral states from $s_0^{Root}$
come only from $C_{\mathrm{K}}$’s (with $\mathrm{K} = 1,\ldots,\mathrm{K}_{max}$), and deviations from different $C_{\mathrm{K}}$’s behave
independently from each other (thanks to the delimiting gapless columns and conditions (i)
and (ii)). Thus, we get the direct-product structure:

$$\Delta\Sigma[s_0^{Root}; \alpha[s_1,s_2,\ldots,s_{N^X}]; \{n \in \mathrm{N}^{IN}(T)\}; T] = \underset{\mathrm{K}=1}{\overset{\mathrm{K}_{max}}{\times}} \Delta\Sigma[C_{\mathrm{K}}; s_0^{Root}; \alpha[s_1,s_2,\ldots,s_{N^X}]; \{n \in \mathrm{N}^{IN}(T)\}; T] .$$
— Eq.(SM-4.19)

Here, $\Delta\Sigma[C_{\mathrm{K}}; s_0^{Root}; \alpha[s_1,s_2,\ldots,s_{N^X}]; \{n \in \mathrm{N}^{IN}(T)\}; T]$ is the space of deviations within $C_{\mathrm{K}}$.

In Eq.(SM-4.17), all the absolute dependences on $s_0^{Root}$ were factored out of the product over
$\mathrm{K}$. Thus, in Eq.(SM-4.6) (or Eq.(R7.5)), the summation over
$\Sigma[\alpha[s_1,s_2,\ldots,s_{N^X}]; \{n \in \mathrm{N}^{IN}(T)\}; T]$ is reduced to the summation over
$\Delta\Sigma[s_0^{Root}; \alpha[s_1,s_2,\ldots,s_{N^X}]; \{n \in \mathrm{N}^{IN}(T)\}; T]$. Exploiting Eq.(SM-4.17) and Eq.(SM-4.19),
Eq.(SM-4.6) can be re-expressed into the final factorized form:
$$P[\alpha[s_1,s_2,\ldots,s_{N^X}] \mid T] = P_0[s_0^{Root} \mid T] \prod_{\mathrm{K}=1}^{\mathrm{K}_{max}} \mathrm{M}_P[\alpha[s_1,s_2,\ldots,s_{N^X}]; s_0^{Root}; C_{\mathrm{K}} \mid T] .$$
— Eq.(SM-4.20)

(It corresponds to Eq.(R7.9).) Here,

$$P_0[s_0^{Root} \mid T] \equiv P[(s_0^{Root}, n^{Root})] \, P[\{[\,]\}_T \mid (s_0^{Root}, n^{Root})]$$
— Eq.(SM-4.21)

(which corresponds to Eq.(R7.10)) is the probability of having a sequence state $s_0^{Root}$ that
has been intact all across tree $T$, and

$$\mathrm{M}_P[\alpha[s_1,s_2,\ldots,s_{N^X}]; s_0^{Root}; C_{\mathrm{K}} \mid T] \equiv \sum_{\{s(n) - s_0^{Root}\}_{\mathrm{N}^{IN}}[C_{\mathrm{K}}]} \mathrm{M}_P[\alpha[s_1,s_2,\ldots,s_{N^X}]; \{s(n)\}_{\mathrm{N}^{IN}}; s_0^{Root}; C_{\mathrm{K}} \mid T]$$
— Eq.(SM-4.22)

is the multiplication factor contributed from all MSA-consistent local indel histories (along
$T$) confined in $C_{\mathrm{K}}$. [NOTE: $\mathrm{M}_P[\alpha[s_1,s_2,\ldots,s_{N^X}]; s_0^{Root}; C_{\mathrm{K}} \mid T]$ given in Eq.(SM-4.22)
should be equivalent to $\mathrm{M}_P[\Lambda^{ID}[C_{\mathrm{K}}; \alpha[s_1,s_2,\ldots,s_{N^X}]; T] \mid T]$ given in Eq.(4.2.9c) of [32],
although the two expressions may appear quite different at first glance.] In Eq.(SM-4.22), we
let $\{s(n) - s_0^{Root}\}_{\mathrm{N}^{IN}}[C_{\mathrm{K}}]$ denote the portion of the deviation of $\{s(n)\}_{\mathrm{N}^{IN}}$ from $s_0^{Root}$
confined in $C_{\mathrm{K}}$.
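
The last step of the derivation, going from a sum over the direct-product space of Eq.(SM-4.19) to the product of region-wise sums in Eq.(SM-4.20) with Eq.(SM-4.22), rests on the elementary identity that a sum over all combinations of a product of region-wise factors equals the product of the region-wise sums. The Python sketch below checks this numerically. The deviation labels and all numerical values are hypothetical stand-ins for the augmented multiplication factors and for $P_0[s_0^{Root} \mid T]$; they are not derived from any indel model.

```python
import itertools
import math

# One dict per region C_K: local deviation label -> augmented multiplication factor.
# All labels and values are hypothetical.
augmented_factors = [
    {"no_deviation": 0.6, "dev_a": 0.3},
    {"no_deviation": 0.5, "dev_b": 0.2, "dev_c": 0.1},
    {"no_deviation": 0.9},
]
p0 = 0.02  # stands in for P_0[s_0^Root | T]

# Sum over the direct product of region-wise deviations,
# as in Eq.(SM-4.6) combined with Eq.(SM-4.17).
sum_over_product_space = sum(
    p0 * math.prod(factors[dev] for factors, dev in zip(augmented_factors, combo))
    for combo in itertools.product(*(f.keys() for f in augmented_factors))
)

# Product of region-wise sums, as in Eq.(SM-4.20) combined with Eq.(SM-4.22).
product_of_region_sums = p0 * math.prod(sum(f.values()) for f in augmented_factors)
```

The two quantities agree up to floating-point error, while the second needs only a number of terms that grows with the sum, not the product, of the region-wise counts of deviations.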
Supplementary figures (with legends)


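[Figure S1: image not reproduced. Panel titles and equations follow.]

a Global indel history

b Resulting MSA (in $S^{II}$) and local regions

c LHS (original representation):

$\hat{\vec{M}} = \{ \vec{M}[k] = [\hat{M}[k,1], \hat{M}[k,N_k]] \}_{k=1,2,3}$, with
$\vec{M}[1] = [\hat{M}_D(4,4), \hat{M}_D(3,4)] = [\hat{M}_D(4,4), \hat{M}_D(3,4)]$,
$\vec{M}[2] = [\hat{M}_I(7,1), \hat{M}_D(8,8)] = [\hat{M}_I(4,1), \hat{M}_D(5,5)]$,
$\vec{M}[3] = [\hat{M}_I(8,2), \hat{M}_I(10,1)] = [\hat{M}_I(7,2), \hat{M}_I(8,1)]$.

d LHS (vector representation):

$(\vec{M}[\gamma_1], \vec{M}[\gamma_2], \ldots, \vec{M}[\gamma_7])$, with
$\vec{M}[\gamma_1] = \vec{M}[\gamma_2] = \vec{M}[\gamma_4] = \vec{M}[\gamma_7] = [\,]$,
$\vec{M}[\gamma_3] = \vec{M}[1]$, $\vec{M}[\gamma_5] = \vec{M}[2]$, $\vec{M}[\gamma_6] = \vec{M}[3]$.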

Figure S1. “Vector” representation of example LHS along time interval.

a An example global indel history, consisting of six indel events and seven resulting sequence
states (including the initial state $s_I$). b The resulting MSA among the sequence states that the
indel history went through. The boldface letters in the leftmost column indicate the sequence
states in the global history (panel a). The 1-9,A-D in the cells are the ancestry indices of the
sites. The cells shaded in magenta and red represent the sites to be deleted. Those shaded in
cyan and blue represent the inserted sites. And those shaded in yellow represent the inserted
sites to be deleted. Below the MSA, the bottom curly brackets indicate the regions $\gamma_\kappa$
($\kappa = 3,5,6$ in this example) that actually accommodate local indel histories. And the yellow
wedges indicate the regions $\gamma_\kappa$ ($\kappa = 1,2,4,7$ in this example) that can potentially
accommodate local indel histories, but that actually do not. In this example, $K = 3$,
$N_1 = N_2 = N_3 = 2$, and $\kappa_{max} = 7$. c The original representation of the local history set (LHS).
In each defining equation for $\vec{M}[k]$ ($k = 1,2,3$), the expression in the middle is the local
history represented by its action on the initial state ($s_I$). And on the right-most side is the
representation by the actual indel events in the global history (in panel a), where the prime
indicates that each defining event is equivalent to but not necessarily equal to the
corresponding event in the global history. d The vector representation of the LHS. The “[]”
denotes an empty local history, in which no indel event took place. The figure was adapted
from Figure 10 of [32].
[Figure S2: image not reproduced. Panel titles and equations follow.]

a Global indel history

$\vec{M}(b1) = [\hat{M}_D(11,11), \hat{M}_D(4,5)]$, $\vec{M}(b2) = [\hat{M}_I(5,1)]$, $\vec{M}(b3) = [\,]$,
$\vec{M}(b4) = [\hat{M}_D(4,4)]$, $\vec{M}(b5) = [\hat{M}_I(5,1)]$, $\vec{M}(b6) = [\hat{M}_I(10,1)]$.

b Resulting MSA (in $S^{II}$) and local regions

c LHSs along branches (vector representation):

$\vec{\vec{M}}(b1) = (\vec{M}[\gamma_1(b1)], \ldots, \vec{M}[\gamma_{11}(b1)]) = ([\,],[\,],[\,],[\hat{M}_D(4,5)],[\,],[\,],[\,],[\,],[\hat{M}_D(11,11)],[\,],[\,])$.

Similarly,

$\vec{\vec{M}}(b6) = (\vec{M}[\gamma_1(b6)], \ldots, \vec{M}[\gamma_{13}(b6)]) = ([\,],[\,],[\,],[\,],[\,],[\,],[\,],[\,],[\,],[\,],[\hat{M}_I(10,1)],[\,],[\,])$,
$\vec{\vec{M}}(b2) = (\vec{M}[\gamma_1(b2)], \ldots, \vec{M}[\gamma_{14}(b2)]) = ([\,],[\,],[\,],[\,],[\,],[\hat{M}_I(5,1)],[\,],[\,],[\,],[\,],[\,],[\,],[\,],[\,])$,
$\vec{\vec{M}}(b3) = (\vec{M}[\gamma_1(b3)], \ldots, \vec{M}[\gamma_{14}(b3)]) = ([\,],[\,],[\,],[\,],[\,],[\,],[\,],[\,],[\,],[\,],[\,],[\,],[\,],[\,])$,
$\vec{\vec{M}}(b4) = (\vec{M}[\gamma_1(b4)], \ldots, \vec{M}[\gamma_{13}(b4)]) = ([\,],[\,],[\,],[\hat{M}_D(4,4)],[\,],[\,],[\,],[\,],[\,],[\,],[\,],[\,],[\,])$.

d LHS along the tree (vector representation):

$\{\vec{\vec{M}}(b)\}_T = (\{\vec{\vec{M}}(b)\}_T[C_1], \ldots, \{\vec{\vec{M}}(b)\}_T[C_{10}])$, with
$\{\vec{\vec{M}}(b)\}_T[C_{\mathrm{K}}] = \{\,\}$ for $\mathrm{K} = 1,2,3,5,6,7,9,10$,
$\{\vec{\vec{M}}(b)\}_T[C_4] = \{\vec{M}[\gamma_6(b5)] = [\hat{M}_I(5,1)], \vec{M}[\gamma_4(b1)] = [\hat{M}_D(4,5)], \vec{M}[\gamma_6(b2)] = [\hat{M}_I(5,1)], \vec{M}[\gamma_4(b4)] = [\hat{M}_D(4,4)]\}$,
$\{\vec{\vec{M}}(b)\}_T[C_8] = \{\vec{M}[\gamma_{11}(b6)] = [\hat{M}_I(10,1)], \vec{M}[\gamma_9(b1)] = [\hat{M}_D(11,11)]\}$.

Figure S2. MSA regions potentially able to accommodate local indel histories along tree.

a A global indel history along a tree. Sequence IDs are assigned to the nodes. Each branch is
accompanied with an ID (b1 - b6) and its own global indel history. The “R” stands for the
root. b Resulting MSA of the “extant” sequences at external nodes and the ancestral
sequences at internal nodes. The boldface letters in the leftmost column are the node IDs.
Below the MSA, the bottom curly brackets indicate regions $C_{\mathrm{K}}$ ($\mathrm{K} = 4,8$ in this example)
that actually accommodate local indel histories along the tree. And the yellow wedges
indicate the regions $C_{\mathrm{K}}$ ($\mathrm{K} = 1,2,3,5,6,7,9,10$ in this example) that can potentially
accommodate local indel histories along the tree, but that actually do not. In this example,
$\mathrm{K}_{max} = 10$. c LHSs along the branches (in the vector representation). As examples, the PWAs
along branches b1 and b5 are also shown, along with their own potentially
local-history-accommodating regions. d LHS along the tree (vector representation). Only the
non-empty components are shown explicitly.

The figure follows basically the same notation as Figure S1 does. A cell in the MSA
is shaded only if it is inserted/deleted along an adjacent branch. The figure was adapted from
Figure 11 of [32].
[Figure S3: image not reproduced. Panel a is titled “Regions of indel rate changes, and a moderate indel history”; the vertical axes of the graphs above the MSAs are labeled “Rate.”]

Figure S3. Example of the partially factorable indel model, Eqs.(R8-3.1,2).

a Regions confining indel rate changes. In this panel, all indels are either completely within
or completely outside of the regions. The graph above the MSA schematically indicates the indel rates of the
regions. Indel rate changes are confined in two regions, $E_1$ and $E_2$. Other than that, the
figure uses the same notation as in Figure S1. Although the deletion of a site with ancestry ‘4’
and the deletion of a site with ancestry ‘6’ are separated by a PAS (with ancestry ‘5’), they are
lumped together to form a single local indel history, because they are both contained in $E_1$. b
When a deletion sticks out of a region of changed indel rates. The deletion of the two sites
(with ancestries ‘A’ and ‘B’) sticks out of region $E_2$. In this case, $\gamma_6$ is extended to
encompass this deletion, and ends up engulfing the old $\gamma_7$ and $\gamma_8$. All indel events within
this new $\gamma_6$ define a single local indel history. c When a deletion bridges two regions of
changed indel rates. The deletion of the three sites (with ancestries ‘6,’ ‘7’ and ‘8’) bridges
regions $E_1$ and $E_2$. In this case, $E_1$ and $E_2$, as well as the spacer region between them,
are put together to form a “meta-region” (the new $\gamma_4$). And the indel events within the
meta-region are lumped together to form a single local indel history. The figure was adapted
from Figure 12 of [32].
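
The merging rule described in this legend (a region of changed indel rates is extended when a deletion sticks out of it, and two regions are fused into a meta-region when a deletion bridges them) can be sketched as an interval-merging procedure. The Python code below is only an illustration under stated assumptions: regions and deletions are encoded as closed integer intervals of site positions, the initial regions are assumed to be mutually disjoint, and the coordinates in the examples are hypothetical rather than read off the figure.

```python
def overlaps(a, b):
    """True if the closed integer intervals a = (lo, hi) and b = (lo, hi) share a site."""
    return a[0] <= b[1] and b[0] <= a[1]


def merge_regions(rate_regions, deletions):
    """Extend or fuse rate-change regions touched by deletions.

    A deletion lying entirely outside every region leaves the regions unchanged.
    Assumption (not from the source): the input regions are mutually disjoint.
    """
    regions = sorted(rate_regions)
    changed = True
    while changed:
        changed = False
        for deletion in deletions:
            hit = [r for r in regions if overlaps(r, deletion)]
            if not hit:
                continue
            hull = (min([deletion[0]] + [r[0] for r in hit]),
                    max([deletion[1]] + [r[1] for r in hit]))
            if hit != [hull]:  # the deletion sticks out of, or bridges, regions
                regions = sorted([r for r in regions if r not in hit] + [hull])
                changed = True
    return regions


# Hypothetical coordinates for two rate-change regions.
e1, e2 = (4, 6), (10, 11)
sticking_out = merge_regions([e1, e2], [(11, 12)])  # deletion sticks out of the second region
bridging = merge_regions([e1, e2], [(6, 10)])       # deletion bridges both regions
outside = merge_regions([e1, e2], [(8, 8)])         # deletion touches neither region
```

In the first case only the second region grows, in the second case the two regions and the spacer between them become one meta-region, and in the third case nothing changes.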




Supplementary table 


Table S1. Mathematical symbols common in this paper

[NOTE: The symbols are arranged in the following order: Non-alphabetic symbols -> Roman 
alphabetic characters -> Greek alphabetic characters.] 


Symbol | Description | First occurrence (or definition)

Non-alphabetic symbols

$\langle x \vert$ (bra) | A bra-vector that represents the state $x$. (A bra-vector is an extension of a row-vector in the standard formulation.) | Background; Supplementary appendix SA-1
$\vert y \rangle$ (ket) | A ket-vector that “accepts” the state $y$. (A ket-vector is an extension of a column-vector in the standard formulation.) | Background; Supplementary appendix SA-1
$\hat{O}$ (hat) | An operator that represents the action of $O$. (An operator is an extension of a matrix in the standard formulation.) | Background; Supplementary appendix SA-1
$X \sim Y$ (tilde) | $X$ is equivalent to $Y$. | In general

Beginning with Roman alphabetic characters

$\{b\}_T$ | The set of all branches of the tree ($T$). | Section R7, 2nd paragraph
$C_1, C_2, \ldots, C_{\mathrm{K}_{max}}$ | The maximum possible set of regions each of which can accommodate local indel histories consistent with the portion of a given MSA confined in the region. | Section R7, above Eq.(R7.8)
$H^{ID}(s_0)$ | The set of all possible indel histories along a time axis (or a branch) that begin with the sequence state, $s_0$. | Section R7, above Eq.(R7.1)
$H^{ID}(N; s_0)$ | The set of all possible histories of $N$ indels each along a time axis (or a branch) that begin with the sequence state, $s_0$. | Section R4, Eq.(R4.6)
$H^{ID}[\alpha(s^A, s^D)]$ | The set of all indel histories consistent with the PWA, $\alpha(s^A, s^D)$. | Section R4, above Eq.(R4.9)
$H^{ID}[N; \alpha(s^A, s^D)]$ | The set of all indel histories with $N$ indels each that can result in the PWA, $\alpha(s^A, s^D)$. | Section R4, Eq.(R4.8)
$\hat{I}$ | The identity operator. | Section R3, Eq.(R3.18)
$L(s)$ | The length of a sequence in state $s$. | Section R3
$\hat{M}_D(x_B, x_E)$ | The deletion of the subsequence between (and including) the $x_B$-th and $x_E$-th sites. | Section R2, Figure 3c
$\hat{M}_I(x, l)$ | The insertion of $l$ sites between the $x$-th and $(x+1)$-th sites. | Section R2, Figure 3b
$\hat{M}_\nu$ | The $\nu$-th event in an indel history. | Section R4, Eq.(R4.7)
$\vec{M} = [\hat{M}_1, \ldots, \hat{M}_N]$ | An indel history consisting of $N$ indel events, $\hat{M}_1, \ldots, \hat{M}_N$. | Section R4, Eqs.(R4.6,7)
$\hat{M}_\nu(b)$ | The $\nu$-th event in an indel history along the branch, $b$. | Section R7, Eq.(R7.1)
$\vec{M}(b)$ | An indel history along the branch, $b$. | Section R7, Eq.(R7.1)
$\{\vec{M}(b)\}_T$ | An indel history along the tree, $T$. | Section R7, Eq.(R7.1)


$\hat{M}[k, i_k]$ | The operator representing the $i_k$-th event in the $k$-th local indel history isolated from a global indel history. | Section R5, Eq.(R5.4)
$\hat{\vec{M}} = \{\vec{M}[k]\}_{k=1,\ldots,K}$ | A local history set (LHS) that consists of $K$ local indel histories, which in isolation are: $[\hat{M}[k,1], \ldots, \hat{M}[k,N_k]]$ with $k = 1,\ldots,K$. | Section R5 (2nd-last paragraph); Section R6, Eq.(R6.1)
$\vec{M}[\gamma_\kappa]$ | A local indel history that can yield the portion of a given PWA confined in the region, $\gamma_\kappa$. | Section R6, Eq.(R6.7)
$\vec{\vec{M}} = (\vec{M}[\gamma_1], \ldots, \vec{M}[\gamma_{\kappa_{max}}])$ | The vector representation of the LHS ($\hat{\vec{M}}$), using the set of finest local regions, $\gamma_1, \ldots, \gamma_{\kappa_{max}}$. | Section R6, above Eq.(R6.7)
$[\hat{\vec{M}}]_{LHS}$ | A local-history-set (LHS) equivalence class represented by the LHS, $\hat{\vec{M}}$ (e.g., $= \{[\hat{M}[k,1], \ldots, \hat{M}[k,N_k]]\}_{k=1,\ldots,K}$). | Section R6, Eq.(R6.1)

$\mathbb{N}_1$ $(= \{1,2,3,\ldots\})$ | The set of all positive integers. | In general
$N_{min}[\alpha(s^A, s^D)]$ | The minimum number of indels required for creating the PWA, $\alpha(s^A, s^D)$. | Section R4, Eq.(R4.8)
$\mathrm{N}^{IN}(T)$ | The set of all internal nodes of the tree ($T$). | Section R7, 2nd paragraph
$N^X$ $(= \lvert \mathrm{N}^X(T) \rvert)$ | The number of external nodes of the tree ($T$). | Section R7, 2nd paragraph
$\mathrm{N}^X(T)$ $(= \{n_1, \ldots, n_{N^X}\})$ | The set of all external nodes of the tree ($T$). | Section R7, 2nd paragraph
$\{n\}_T$ $(= \mathrm{N}^{IN}(T) + \mathrm{N}^X(T))$ | The set of all nodes of the tree ($T$). | Section R7, 2nd paragraph
$n^A(b)$ | The “ancestral node” on the upstream end of the branch ($b$). | Section R7, 2nd paragraph
$n^D(b)$ | The “descendant node” on the downstream end of the branch ($b$). | Section R7, 2nd paragraph
$n^{Root}$ | The root node of a given tree. | Section R7, 2nd paragraph
$P[(s, n)]$ | The probability that the sequence is in state $s$ at node $n$ of the tree. | Section R7, Eq.(R7.4)
$P[X \mid Y]$ | The conditional probability that we have the outcome ($X$) conditioned on $Y$. | In general
$P[(s', t') \mid (s, t)]$ | The conditional probability that the sequence is in state $s'$ at time $t'$ conditioned on that it was in state $s$ at time $t$. | Section R3, Eq.(R3.17)
$P[([\,], [t_I, t_F]) \mid (s_0, t_I)]$ $(= \exp\{-\int_{t_I}^{t_F} d\tau \, R_X^{ID}(s_0, \tau)\})$ | The probability that the sequence with an initial state, $s_0$, underwent no indel during the time interval, $[t_I, t_F]$. | Section R4, below Eq.(R4.7)
$P_0[s_0^{Root} \mid T]$ | The probability that the sequence was in state $s_0^{Root}$ at the root and that it underwent no indels all across the tree ($T$). | Section R7, Eq.(R7.10)
$\hat{P}^{ID}(t, t')$ | The finite-time transition operator of our indel evolutionary model, from time $t$ to time $t'$. | Section R3, Eq.(R3.17)
$\hat{P}_0^{ID}(t', t'')$ | $= T\{\exp(\int_{t'}^{t''} d\tau \, \hat{Q}_0^{ID}(\tau))\}$, i.e., the operator describing the evolution from $t'$ till $t''$ with no indel. | Section R4, Eq.(R4.4), below Eq.(SM-1.4)

$\hat{Q}^{ID}(t)$ $(= \hat{Q}^I(t) + \hat{Q}^D(t))$ | The total rate operator (at time $t$) of our indel evolutionary model. | Section R3, Eq.(R3.11)
$\hat{Q}_0^{ID}(t)$ $(= \hat{Q}_X^I(t) + \hat{Q}_X^D(t))$ | The mutation-free part of the total rate operator ($\hat{Q}^{ID}(t)$). | Section R4, Eq.(R4.1), Eq.(R4.2)
$\hat{Q}_M^{ID}(t)$ $(= \hat{Q}_M^I(t) + \hat{Q}_M^D(t))$ | The part of the total rate operator ($\hat{Q}^{ID}(t)$) describing the single-mutation transition between states. | Section R4, Eq.(R4.1)
$\hat{Q}^m(t)$ $(= \hat{Q}_M^m(t) + \hat{Q}_X^m(t))$ | The component of the rate operator (at time $t$) due to mutations of type $m$ (= I or D). | Section R3, Eq.(R3.2)
$\hat{Q}_M^m(t)$ | The “mutation part” of the rate operator that describes the instantaneous transition (at time $t$) via mutations of type $m$ (= I or D). | Section R3, Eq.(R3.2), Eqs.(R3.12, 13)
$\hat{Q}_X^m(t)$ | The “exit rate part” of the rate operator that attenuates the state retention probability via mutations of type $m$ (= I or D). | Section R3, Eq.(R3.2), Eq.(R3.6)
$R_X^{ID}(s,t) \equiv R_X^I(s,t) + R_X^D(s,t)$ | The total exit rate of the sequence state ($s$) at time $t$ due to indels. | Section R4, Eq.(R4.3)
$R_X^m(s,t)$ | The component of the exit rate of the sequence state ($s$) at time $t$ due to mutations of type $m$ (= I or D). | Section R3, Eqs.(R3.14, 15)
$r(\hat{M}; s, t)$ | The rate of the mutation represented by $\hat{M}$ on the sequence in state $s$ at time $t$. (In general, the rate depends on $s$ and $t$.) | Section R4, Eq.(R4.7); Eq.(SM-1.13)
$r_D(x_B, x_E; s, t)$ | The rate of deletion of the subsequence between (and including) the $x_B$-th and $x_E$-th sites, from the sequence (in state $s$) at time $t$. (The rate generally depends on $s$ and $t$.) | Section R3 (near the top)
$r_I(x, l; s, t)$ | The rate of insertion of $l$ sites between the $x$-th and $(x+1)$-th sites of the sequence (in state $s$) at time $t$. (The rate generally depends on $s$ and $t$.) | Section R3 (near the top), Eq.(R3.16)


$S^{II}$ | The space of all basic sequence states. | Section R2
$s$ $(= \vec{\upsilon} = [\upsilon_1, \upsilon_2, \ldots, \upsilon_L])$ | A basic sequence state (of length $L$), in which each site ($x$) is assigned an ancestry ($\upsilon_x$) alone. | Section R2, Figure 2c
$s = [(\upsilon_1, \omega_1), (\upsilon_2, \omega_2), \ldots, (\upsilon_L, \omega_L)]$ | An extended sequence state (of length $L$), in which each site ($x$) is assigned an ancestry ($\upsilon_x$) and a residue ($\omega_x$). | Section R2, Figure 2b
$s(n)$ $(\in S^{II})$ | The sequence state at the node $n \in \{n\}_T$. | Section R7, 2nd paragraph
$s^A(b)$ $(= s(n^A(b)))$ | The sequence state at the “ancestral node” on the upstream end of branch $b$. | Section R7, 2nd paragraph
$s^D(b)$ $(= s(n^D(b)))$ | The sequence state at the “descendant node” on the downstream end of branch $b$. | Section R7, 2nd paragraph
$s^{Root} = s(n^{Root})$ | The sequence state at the root node. | Section R7, 3rd paragraph
$s_0^{Root}$ | A “reference” root state. | Section R7, above Eq.(R7.8)
$\{s(n)\}_{\mathrm{N}^{IN}}$ | A set of ancestral states at all internal nodes. | Section R7, above Eq.(R7.5)
$T$ $(= (\{n\}_T, \{b\}_T))$ | A (rooted) phylogenetic tree. | Section R7, 2nd paragraph
$T\{\cdots\}$ | The (summation of) time-ordered product(s). It rearranges the operators in each product in the temporal order so that the earliest operator comes leftmost. | Section R3, Eq.(R3.18); Eq.(SA-1.11)
$\bigcup_{a \in A} X(a)$ | The union of the sets (spaces), $X(a)$’s, which form a function on a space (set), $A$, over all elements ($a$’s) in $A$. | In general

Beginning with Greek alphabetic characters

$\alpha(s^A, s^D)$ | A PWA between the ancestral sequence ($s^A$) and the descendant sequence ($s^D$). | Section R4, above Eq.(R4.8)
$\alpha[s_1, s_2, \ldots, s_{N^X}]$ | A MSA among the sequences at the external nodes, $s_i = s(n_i) \in S^{II}$ ($n_i \in \mathrm{N}^X(T)$). | Section R7, above Eq.(R7.4)
$\gamma_1, \gamma_2, \ldots, \gamma_{\kappa_{max}}$ | The finest regions each of which can potentially accommodate local indel histories consistent with a given PWA. | Section R6, above Eq.(R6.7)
$\delta R_X^{ID}(s, s', t) \equiv R_X^{ID}(s,t) - R_X^{ID}(s',t)$ | The difference of the exit rate of state $s$ from that of state $s'$ at time $t$. | Section R6, condition (ii); Eq.(SM-2.7)
$\Theta_{ID}(b)$ | The model parameters for the indel processes along the branch, $b$. | Section R7, 2nd paragraph
$\mathrm{K}_{max}$ | The maximum possible number of the potentially local-history-accommodating regions consistent with a given MSA. | Section R7, above Eq.(R7.8)
$\kappa_{max}$ | The number of the finest potentially local-history-accommodating regions consistent with a given PWA. | Section R6, above Eq.(R6.7)
$\Lambda^{ID}[\alpha(s^A, s^D)]$ | The set of all local history sets (LHSs) consistent with a PWA ($\alpha(s^A, s^D)$). | Section R6, Eq.(R6.5)
$\Lambda^{ID}[\gamma_\kappa; \alpha(s^A, s^D)]$ | The set of local indel histories that can give rise to the sub-PWA of $\alpha(s^A, s^D)$ confined in $\gamma_\kappa$. | Section R6, Eq.(R6.7)

$\mathrm{M}_P[\alpha[s_1, s_2, \ldots, s_{N^X}]; s_0^{Root}; C_{\mathrm{K}} \mid T]$ | The multiplication factor contributed from all local indel histories along the tree ($T$) each of which can yield the portion of a MSA ($\alpha[s_1, s_2, \ldots, s_{N^X}]$) confined in the region, $C_{\mathrm{K}}$. | Section R7, Eq.(R7.9), below Eq.(R7.10)
$\mu_P[s^{Root}, s_0^{Root}, n^{Root}; C_{\mathrm{K}}]$ | The (multiplicative) change in the state probability at the root ($n^{Root}$) due to the difference between the states, $s_0^{Root}$ and $s^{Root}$, within the region, $C_{\mathrm{K}}$. | Section R7, Eq.(R7.8)
$\mu_P[([\hat{M}[k,1], \ldots, \hat{M}[k,N_k]], [t_I, t_F]) \mid (s_0, t_I)]$ | The probability quotient (multiplication factor) from the local indel history, $[\hat{M}[k,1], \ldots, \hat{M}[k,N_k]]$. | Section R6, Eq.(R6.2), Eq.(R6.3)
$\mu_P[([\hat{\vec{M}}]_{LHS}, [t_I, t_F]) \mid (s_0, t_I)]$ | The total probability quotient (multiplication factor) from the LHS equivalence class, $[\hat{\vec{M}}]_{LHS}$. | Section R6, Eq.(R6.2), Eq.(R6.4)
$\prod_{a \in A} F(a)$ | The product of the values of a function, $F(a)$, over all elements ($a$’s) in the space (set), $A$. | In general
$\sum_{a \in A} F(a)$ | The summation of the values of a function, $F(a)$, over all elements ($a$’s) in the space (set) $A$. | In general

$\Sigma[\alpha[s_1, s_2, \ldots, s_{N^X}]; \{n \in \mathrm{N}^{IN}(T)\}; T]$ | The set of all $\{s(n)\}_{\mathrm{N}^{IN}}$’s (i.e., all sets of sequence states at internal nodes) that are consistent with the MSA, $\alpha[s_1, s_2, \ldots, s_{N^X}]$, and the tree, $T$. | Section R7, above Eq.(R7.5); above Eq.(SM-4.5)
$\Upsilon$ | The set of ancestry indices. | Section R2
$\upsilon_x$ $(\in \Upsilon)$ | The ancestry index assigned to the $x$-th site of a sequence. | Section R2
$\vec{\upsilon} = [\upsilon_1, \upsilon_2, \ldots, \upsilon_L]$ | An array of ancestry indices assigned to the sites of a sequence (of length $L$). | Section R2, Figure 2c
$\Psi^{ID}[\alpha[s_1, s_2, \ldots, s_{N^X}]; T]$ | The set of all pairs, $(s^{Root}, \{\vec{M}(b)\}_T)$, defined on $T$ that are consistent with the MSA, $\alpha[s_1, s_2, \ldots, s_{N^X}]$. | Section R7, above Eq.(R7.4); above Eq.(SM-4.4)
$\Omega$ | An alphabet, or the set of all possible residues (such as 4 bases for DNA or 20 amino acids for proteins). | Section R1
$\omega_x$ $(\in \Omega)$ | The residue at the $x$-th site of a sequence. | Section R1
$\vec{\omega} = [\omega_1, \omega_2, \ldots, \omega_L]$ | An array of residues assigned to the sites of a sequence (of length $L$). | Section R1, Figure 2a